SQL-CLR Type Mismatches
LINQ to SQL automates much of the translation between the object model and SQL Server. Nevertheless, some situations prevent exact translation. These key mismatches between the common language runtime (CLR) types and the SQL Server database types are summarized in the following sections. You can find more details about specific type mappings and function translation at SQL-CLR Type Mapping and Data Types and Functions.
Translation between the CLR and SQL Server occurs when a query is being sent to the database, and when the results are sent back to your object model. For example, the following Transact-SQL query requires two value conversions:
Select DateOfBirth From Customer Where CustomerId = @id
Before the query can be executed on SQL Server, the value for the Transact-SQL parameter must be specified. In this example, the id parameter value must first be translated from a CLR System.Int32 type to a SQL Server INT type so that the database can understand what the value is. Then, to retrieve the results, the SQL Server DateOfBirth column must be translated from a SQL Server DATETIME type to a CLR System.DateTime type for use in the object model. In this example, the types in the CLR object model and SQL Server database have natural mappings, but this is not always the case.
The following types do not have reasonable counterparts.
Mismatches in the CLR System namespace:
Unsigned integers. These types are typically mapped to their signed counterparts of larger size to avoid overflow. Literals can be converted to a signed numeric of the same or smaller size, based on value.
Boolean. These types can be mapped to a bit or larger numeric or string. A literal can be mapped to an expression that evaluates to the same value (for example, 1=1 in SQL for True in CLS).
TimeSpan. This type represents the difference between two DateTime values and does not correspond to the timestamp of SQL Server. The CLR TimeSpan may also map to the SQL Server TIME type in some cases. However, the SQL Server TIME type was only intended to represent positive values less than 24 hours, whereas the CLR TimeSpan has a much larger range.
SQL Server-specific .NET Framework types in System.Data.SqlTypes are not included in this comparison.
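The widening of unsigned integers can be made concrete with a small range check. The following Python sketch is only an illustration: the helper name `smallest_signed_sql_type` is hypothetical, Python integers stand in for the CLR values, and the fallback to `DECIMAL(20,0)` is an assumption about how a value beyond the BIGINT range would be stored.

```python
# Ranges of the signed SQL Server integer types (inclusive).
SIGNED_SQL_TYPES = [
    ("SMALLINT", -2**15, 2**15 - 1),
    ("INT", -2**31, 2**31 - 1),
    ("BIGINT", -2**63, 2**63 - 1),
]

def smallest_signed_sql_type(value: int) -> str:
    """Hypothetical helper: name the smallest signed type that holds value."""
    for name, low, high in SIGNED_SQL_TYPES:
        if low <= value <= high:
            return name
    # Assumption: nothing signed and 64-bit or smaller fits, so fall back
    # to a decimal type wide enough for any 64-bit unsigned value.
    return "DECIMAL(20,0)"

UINT16_MAX = 2**16 - 1
UINT32_MAX = 2**32 - 1
UINT64_MAX = 2**64 - 1

# The full range of each unsigned type needs the next larger signed type.
print(smallest_signed_sql_type(UINT16_MAX))
print(smallest_signed_sql_type(UINT32_MAX))
print(smallest_signed_sql_type(UINT64_MAX))
# A small literal fits a type of the same or smaller size.
print(smallest_signed_sql_type(42))
```

The last call mirrors the point about literals: the type can be chosen from the actual value instead of from the declared unsigned type.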
Mismatches in SQL Server:
Fixed length character types. Transact-SQL distinguishes between Unicode and non-Unicode categories and has three distinct types in each category: fixed length nchar/char, variable length nvarchar/varchar, and larger-sized ntext/text. The fixed length character types could be mapped to the CLR System.Char type for retrieving characters, but they do not really correspond to the same type in conversions and behavior.
Bit. Although the bit domain has the same number of values as Nullable<Boolean>, the two are different types. Bit takes values 1 and 0 instead of true/false, and cannot be used as an equivalent to Boolean expressions.
Timestamp. Unlike the CLR TimeSpan type, the SQL Server TIMESTAMP type represents an 8-byte number generated by the database that is unique for each update and is not based on the difference between DateTime values.
Money and SmallMoney. These types can be mapped to Decimal but are basically different types and are treated as such by server-based functions and conversions.
There are many SQL Server data types that you can map to one or more CLR data types. There are also many CLR types that you can map to one or more SQL Server types. Although a mapping may be supported by LINQ to SQL, it does not mean that the two types mapped between the CLR and SQL Server are a perfect match in precision, range, and semantics. Some mappings may include differences in any or all of these dimensions. You can find details about these potential differences for the various mapping possibilities at SQL-CLR Type Mapping.
User-defined CLR types are designed to help bridge the type system gap. Nevertheless, they surface interesting issues about type versioning. A change in the version on the client might not be matched by a change in the type stored on the database server. Any such change causes another type mismatch where the type semantics might not match and the version gap is likely to become visible. Further complications occur as inheritance hierarchies are refactored in successive versions.
In addition to the pairwise mismatch between CLR and database types, expressions add complexity to the mismatch. Mismatches in operator semantics, function semantics, implicit type conversion, and precedence rules must be considered.
The following subsections illustrate the mismatch between apparently similar expressions. It might be possible to generate SQL expressions that are semantically equivalent to a given CLR expression. However, it is not clear whether the semantic differences between apparently similar expressions are evident to a CLR user, and therefore whether the changes that are required for semantic equivalence are intended or not. This is an especially critical issue when an expression is evaluated for a set of values. The visibility of the difference might depend on the data and be hard to identify during coding and debugging.
SQL expressions provide three-valued logic for Boolean expressions. The result can be true, false, or null. By contrast, the CLR specifies a two-valued Boolean result for comparisons involving null values. Consider the following code:
-- Assume col1 and col2 are integer columns with null values.
-- Assume that ANSI null behavior has not been explicitly
-- turned off.
Select …
From …
Where col1 = col2
-- Evaluates to null, not true and the corresponding row is not
-- selected.
-- To obtain matching behavior (i -> col1, j -> col2) change
-- the query to the following:
Select …
From …
Where
    col1 = col2
or (col1 is null and col2 is null)
-- (Visual Basic 'Nothing'.)
A similar problem occurs with the assumption about two-valued results.
-- Assume col1 and col2 are nullable columns.
-- Assume that ANSI null behavior has not been explicitly
-- turned off.
Select …
From …
Where
    col1 = col2
or col1 != col2
-- Visual Basic: col1 <> col2.
-- Excludes the case where the boolean expression evaluates
-- to null. Therefore the where clause does not always
-- evaluate to true.
In the previous case, you can get equivalent behavior in generating SQL, but the translation might not accurately reflect your intention.
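The two null-comparison cases can be tried out with a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, on the assumption that both engines treat `=` and `!=` on null operands the same way under ANSI null behavior. The table `t` and its rows are invented for the illustration, and Python's `None` plays the role of a CLR null.

```python
import sqlite3

# Rows chosen to cover equal, unequal, both-null, and one-null cases.
rows = [(1, 1), (1, 2), (None, None), (1, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

def count(where: str) -> int:
    """Count the rows of t that satisfy the given WHERE clause."""
    return conn.execute(f"SELECT COUNT(*) FROM t WHERE {where}").fetchone()[0]

# SQL: null = null evaluates to null, so the both-null row is filtered out.
sql_equal = count("col1 = col2")
# SQL rewritten to match two-valued semantics.
sql_null_aware = count("col1 = col2 OR (col1 IS NULL AND col2 IS NULL)")
# SQL: looks like a tautology, but rows with a null operand are excluded.
sql_tautology = count("col1 = col2 OR col1 != col2")
# Two-valued comparison in the host language: None == None is True.
host_equal = sum(1 for a, b in rows if a == b)
host_tautology = sum(1 for a, b in rows if a == b or a != b)

print(sql_equal, sql_null_aware, sql_tautology)
print(host_equal, host_tautology)
```

In this sketch the null-aware rewrite returns the same number of rows as the two-valued comparison, while the plain `col1 = col2` predicate and the apparent tautology each return fewer rows than their host-language counterparts.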
LINQ to SQL does not impose C# null or Visual Basic nothing comparison semantics on SQL. Comparison operators are syntactically translated to their SQL equivalents. The semantics reflect SQL semantics as defined by server or connection settings. Two null values are considered unequal under default SQL Server settings (although you can change the settings to change the semantics). Regardless, LINQ to SQL does not consider server settings in query translation.
A comparison with the literal null (nothing) is translated to the appropriate SQL version (is null or is not null).
The value of null (nothing) in collation is defined by SQL Server; LINQ to SQL does not change the collation.
SQL supports a rich set of implicit conversions in expressions. Similar expressions in C# would require an explicit cast. For example:
Nvarchar and DateTime types can be compared in SQL without any explicit casts; C# requires explicit conversion.
Decimal is implicitly converted to DateTime in SQL. C# does not allow for an implicit conversion.
Likewise, type precedence in Transact-SQL differs from type precedence in C# because the underlying set of types is different. In fact, there is no clear subset/superset relationship between the precedence lists. For example, comparing an nvarchar with a varchar causes the implicit conversion of the varchar expression to nvarchar. The CLR provides no equivalent promotion.
In simple cases, these differences cause CLR expressions with casts to be redundant for a corresponding SQL expression. More importantly, the intermediate results of a SQL expression might be implicitly promoted to a type that has no accurate counterpart in C#, and vice versa. Overall, the testing, debugging, and validation of such expressions adds significant burden on the user.
Transact-SQL supports explicit collations as annotations to character string types. These collations determine the validity of certain comparisons. For example, comparing two columns with different explicit collations is an error. The use of the much simplified CTS string type does not cause such errors. Consider the following example:
create table T2 (
    Col1 nvarchar(10),
    Col2 nvarchar(10) collate Latin1_General_CI_AS
)
Select … From … Where Col1 = Col2 -- Error, collation conflict.
In effect, the collation subclause creates a restricted type that is not substitutable.
Similarly, the sort order can be significantly different across the type systems. This difference affects the sorting of results. Guid is sorted on all 16 bytes by lexicographic order (IComparable()), whereas T-SQL compares GUIDs in the following order: node(10-15), clock-seq(8-9), time-high(6-7), time-mid(4-5), time-low(0-3). This ordering was done in SQL 7.0 when NT-generated GUIDs had such an octet order. The approach ensured that GUIDs generated at the same node cluster came together in sequential order according to timestamp. The approach was also useful for building indexes (inserts become appends instead of random IOs). The order was scrambled later in Windows because of privacy concerns, but SQL must maintain compatibility. A workaround is to use SqlGuid instead of Guid.
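The effect of the two orderings can be sketched in Python with the standard uuid module. This is a model under stated assumptions, not the actual comparison code of either system: `bytes_le` is assumed to have the same layout as the byte array of a CLR Guid, the byte positions are taken from the list above, the order of bytes inside each group is an assumption, and the CLR ordering is approximated by comparing the bytes in their canonical (display) order.

```python
import uuid

# Byte positions in the order described above: node (10-15),
# clock-seq (8-9), time-high (6-7), time-mid (4-5), time-low (0-3).
# Assumption: bytes inside each group are compared in ascending position.
SQL_SERVER_BYTE_ORDER = [10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3]

def sql_server_key(u: uuid.UUID) -> tuple:
    """Sort key modeling the T-SQL GUID comparison order."""
    raw = u.bytes_le  # assumed to match the CLR Guid byte array layout
    return tuple(raw[i] for i in SQL_SERVER_BYTE_ORDER)

def clr_like_key(u: uuid.UUID) -> bytes:
    """Sort key modeling a lexicographic comparison of all 16 bytes."""
    return u.bytes

a = uuid.UUID("00000000-0000-0000-0000-000000000002")
b = uuid.UUID("00000001-0000-0000-0000-000000000001")

clr_sorted = sorted([b, a], key=clr_like_key)
sql_sorted = sorted([a, b], key=sql_server_key)

print([str(u) for u in clr_sorted])
print([str(u) for u in sql_sorted])
```

The two sample values differ in the first and the last group, so the model that starts with the leading bytes and the model that starts with the node bytes put them in opposite orders.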
Operators and functions that are essentially comparable have subtly different semantics. For example:
C# specifies short circuit semantics based on lexical order of operands for logical operators && and ||. SQL on the other hand is targeted for set-based queries and therefore provides more freedom for the optimizer to decide the order of execution. Some of the implications include the following:
Semantically equivalent translation would require "CASE … WHEN … THEN" construct in SQL to avoid reordering of operand execution.
A loose translation to AND/OR operators could cause unexpected errors if the C# expression relies on the second operand being evaluated only when the result of the first operand requires it.
The Round() function has different semantics in .NET Framework and in T-SQL.
The starting index for strings is 0 in the CLR but 1 in SQL. Therefore, any function that takes or returns an index needs index translation.
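The index translation amounts to an offset of one in each direction. The following Python sketch models it with hypothetical helper names. `sql_charindex` imitates the 1-based result of the T-SQL CHARINDEX function (0 when the substring is not found), and edge cases such as an empty search string are deliberately not modeled.

```python
def clr_index_of(text: str, sub: str) -> int:
    """0-based position of sub in text, or -1 when it is not found."""
    return text.find(sub)

def sql_charindex(sub: str, text: str) -> int:
    """Model of CHARINDEX: 1-based position, or 0 when not found."""
    return text.find(sub) + 1

def translated_index_of(text: str, sub: str) -> int:
    """What a translation has to compute: shift the SQL result back by one."""
    return sql_charindex(sub, text) - 1

print(clr_index_of("Seattle", "att"), sql_charindex("att", "Seattle"))
print(translated_index_of("Seattle", "att"))
print(translated_index_of("Seattle", "xyz"))
```

Without the shift, a position computed on the server and then used on the client (or the reverse) would be off by one.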
The CLR supports the modulus ('%') operator for floating point numbers but SQL does not.
The Like operator effectively acquires automatic overloads based on implicit conversions. Although the Like operator is defined to operate on character string types, implicit conversion from numeric types or DateTime types allows for those non-string types to be used with Like just as well. In CTS, comparable implicit conversions do not exist. Therefore, additional overloads are needed.
This Like operator behavior applies to C# only; the Visual Basic Like keyword is unchanged.
Overflow is always checked in SQL but it has to be explicitly specified in C# (not in Visual Basic) to avoid wraparound. For example, given integer columns C1, C2, and C3, storing C1 + C2 in C3 (Update T Set C3 = C1 + C2) raises an arithmetic overflow error in SQL when the sum exceeds the integer range.
create table T3 (
    Col1 integer,
    Col2 integer
)
insert into T3 (col1, col2) values (2147483647, 5)
-- Valid values: max integer value and 5.
select * from T3 where col1 + col2 < 0
-- Produces arithmetic overflow error.
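The difference between wraparound and checked arithmetic can be modeled in a few lines of Python. Python integers do not overflow, so the 32-bit behavior is simulated by hand. Both helper names are hypothetical, and the sketch only approximates unchecked C# addition and the checked addition that SQL performs.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def add_wraparound(a: int, b: int) -> int:
    """Model of unchecked 32-bit addition: the result silently wraps."""
    total = (a + b) & 0xFFFFFFFF          # keep the low 32 bits
    return total - 2**32 if total > INT32_MAX else total

def add_checked(a: int, b: int) -> int:
    """Model of checked addition: out-of-range results raise an error."""
    total = a + b
    if not INT32_MIN <= total <= INT32_MAX:
        raise OverflowError("arithmetic overflow")
    return total

# Same operands as the T3 example: max integer value and 5.
wrapped = add_wraparound(2147483647, 5)
print(wrapped, wrapped < 0)

try:
    add_checked(2147483647, 5)
    overflow_raised = False
except OverflowError:
    overflow_raised = True
print(overflow_raised)
```

With wraparound the predicate `col1 + col2 < 0` would be true for this row, whereas checked arithmetic never produces a result to compare.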
SQL performs symmetric arithmetic rounding while .NET Framework uses banker’s rounding. See Knowledge Base article 196652 for additional details.
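The two rounding rules differ only for values that lie exactly halfway between two candidates. The following Python sketch reproduces both rules with the standard decimal module. The function names are hypothetical, and the sketch models the rounding rules themselves, not the .NET or T-SQL functions.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_bankers(text: str) -> Decimal:
    """Banker's rounding: ties go to the nearest even digit."""
    return Decimal(text).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)

def round_symmetric(text: str) -> Decimal:
    """Symmetric arithmetic rounding: ties go away from zero."""
    return Decimal(text).quantize(Decimal("1"), rounding=ROUND_HALF_UP)

for value in ["0.5", "1.5", "2.5", "-2.5", "2.4"]:
    print(value, round_bankers(value), round_symmetric(value))
```

Values such as 2.4 round the same way under both rules. Only the midpoints 0.5, 2.5, and -2.5 in this list produce different results.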
By default, for common locales, character-string comparisons are case-insensitive in SQL. In Visual Basic and in C#, they are case-sensitive. For example, the expression s == "Food" (s = "Food" in Visual Basic) can yield different results in the CLR and in SQL if s is food.
-- Assume default US-English locale (case insensitive).
create table T4 (
    Col1 nvarchar (256)
)
insert into T4 values ('Food')
insert into T4 values ('FOOD')
select * from T4 where Col1 = 'food'
-- Both the rows are returned because of case-insensitive matching.
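The same comparison can be tried in a runnable form. The sketch below uses Python's sqlite3 module, with SQLite's `NOCASE` collation standing in for a case-insensitive SQL Server collation. That substitution is an assumption made for illustration, since SQLite compares strings case-sensitively unless told otherwise.

```python
import sqlite3

values = ["Food", "FOOD"]

conn = sqlite3.connect(":memory:")
# NOCASE is a stand-in for a case-insensitive server collation.
conn.execute("CREATE TABLE T4 (Col1 TEXT COLLATE NOCASE)")
conn.executemany("INSERT INTO T4 VALUES (?)", [(v,) for v in values])

# Database-side comparison: case-insensitive under this collation.
sql_matches = conn.execute(
    "SELECT Col1 FROM T4 WHERE Col1 = 'food' ORDER BY rowid"
).fetchall()

# Host-language comparison: ordinal and therefore case-sensitive.
host_matches = [v for v in values if v == "food"]

print(len(sql_matches), len(host_matches))
```

The query matches both rows, while the host-language filter over the same values matches none.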
Operators and functions applied to fixed length character type arguments in SQL have significantly different semantics than the same operators and functions applied to the CLR System.String. This could also be viewed as an extension of the missing counterpart problem discussed in the section about types.
create table T5 (
    Col1 nchar(4)
)
Insert into T5(Col1) values ('21');
Insert into T5(Col1) values ('1021');
Select * from T5 where Col1 like '%1'
-- Only the second row with Col1 = '1021' is returned.
-- Not the first row!
A similar problem occurs with string concatenation.
create table T6 (
    Col1 nchar(4),
    Col2 nchar(4)
)
Insert into T6 values ('a', 'b');
Select Col1+Col2 from T6
-- Returns concatenation of padded strings "a   b   " and not "ab".
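Both fixed-length effects come from the trailing blanks that pad a value to the declared length. A minimal Python model, assuming that an nchar(4) value behaves like a string padded with spaces to four characters, reproduces the two results above. The helper `nchar` is hypothetical, and `endswith` stands in for the pattern '%1'.

```python
def nchar(value: str, length: int) -> str:
    """Model of a fixed-length value: pad with trailing spaces."""
    return value.ljust(length)

raw_values = ["21", "1021"]
stored = [nchar(v, 4) for v in raw_values]

# Pattern '%1' against padded values: '21  ' ends with a space, not '1'.
padded_matches = [s for s in stored if s.endswith("1")]
# The same test against variable-length strings matches both values.
string_matches = [s for s in raw_values if s.endswith("1")]

# Concatenation keeps the padding of each operand.
padded_concat = nchar("a", 4) + nchar("b", 4)
string_concat = "a" + "b"

print(padded_matches, string_matches)
print(repr(padded_concat), repr(string_concat))
```

A translation that wants CLR string behavior from fixed-length columns would need to trim the padding explicitly before matching or concatenating.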
In summary, a convoluted translation might be required for CLR expressions and additional operators/functions may be necessary to expose SQL functionality.
In C# and in SQL, users can override the default semantics of expressions by using explicit type casts (Cast and Convert). However, exposing this capability across the type system boundary poses a dilemma. A SQL cast that provides the desired semantics cannot be easily translated to a corresponding C# cast. On the other hand, a C# cast cannot be directly translated into an equivalent SQL cast because of type mismatches, missing counterparts, and different type precedence hierarchies. There is a trade-off between exposing the type system mismatch and losing significant power of expression.
In other cases, type casting might not be needed in either domain for validation of an expression but might be required to make sure that a non-default mapping is correctly applied to the expression.
-- Example from "Non-default Mapping" section extended
create table T5 (
    Col1 nvarchar(10),
    Col2 nvarchar(10)
)
Insert into T5(col1, col2) values ('3', '2');
Select *
From T5
Where Col1 + Col2 > 4
-- "Col1 + Col2" expr evaluates to '32'
Accounting for some SQL Server-CLR type differences may result in a decrease in performance when crossing between the CLR and SQL Server type systems. Examples of scenarios impacting performance include the following:
Forced order of evaluation for logical and/or operators
Generating SQL to enforce order of predicate evaluation restricts the SQL optimizer’s ability.
Type conversions, whether introduced by a CLR compiler or by an Object-Relational query implementation, may curtail index usage.
-- Table DDL
create table T5 (
    Col1 varchar(100)
)
Consider the translation of expression (s = SOME_STRING_CONSTANT).
-- Corresponding part of SQL where clause
Where …
Col1 = SOME_STRING_CONSTANT
-- This expression is of the form <varchar> = <nvarchar>.
-- Hence SQL introduces a conversion from varchar to nvarchar,
-- resulting in
Where …
Convert(nvarchar(100), Col1) = SOME_STRING_CONSTANT
-- Cannot use the index for column Col1 for some implementations.
In addition to semantic differences, it is important to consider impacts to performance when crossing between the SQL Server and CLR type systems. For large data sets, such performance issues can determine whether an application is deployable.