I’ve had this problem for the second time yesterday. I’ve had this problem once before, but I could not remember how I solved it then. The situation: Application developed and tested on a local Windows Server 2008, MS SQL Server 2008. I deployed the application on a hosted server with SQL Server 2008 Express. The OS is WS2008 R2. I noticed, that I was missing slovene characters, but just one of them. A letter č (c with caret). I tested the local installation, and everything was fine. I opened SQL Server Management studio, and tried to enter slovene characters there. I typed čšž, submitted the query, and what got insterted was cšž. (notice that č was being converted on the fly to c). What is weird is, that this is happening on a hosted machine, and not on my local machine — although both machines use English as a default language.
Since I’ve had this problem before, and I discussed it with Dejan, I started digging through all of my email correspondence with Dejan, but could not find anything. I checked and changed the collation of a database, still nothing. My best friend, Google, was of no help either. It then struck me. Varchar. It is non-unicode data type. nvarchar is of course, what was needed to fix the problem. It is intriguing though, that all of the other Slovene characters work, but c with caret (č) does not get corrupt as one would expect, but gets converted to c.
A lesson learned, is a lesson earned, I guess. This post is here to remind of this situation next time.
btw you can get around that if you set the collation of the column to Slovenian_100_CI_AI
and still have it as varchar.
CREATE TABLE t1 (name varchar(10) collate Slovenian_100_CI_AI)
go
insert into t1 select ‘čćžšđ’ union all select upper(’čćžšđ’)
SELECT * FROM t1
go
drop table t1
the said behaviour happens if you use SQL_* ollations which are not recommended to be used anymore.
Mladen, I’m puzzled. Your code works. Your code, without the collation on the column, works. The default database collation is Slovenian_CI_AS.
What is very interesting, at least to me, is that the error (for me, at least) happened only in hosting scenarios, where database collation is the same as the one on my local machine.
I’d expect š to work but wouldn’t expect ž. Anyway, I always use nXXX when working with multilingual or just Slovene language.
Sure. I vote for nvarchar data type to be default.
try this:
CREATE TABLE t1
(
name varchar(10) collate
–Slovenian_CI_AI
SQL_Latin1_General_CP1_CS_AS
)
go
insert into t1 select ‘čćžšđ’ union all select upper(’čćžšđ’)
SELECT * FROM t1
go
drop table t1
you’ll see that the Č is getting downgraded to C when using SQL_Latin1_General_CP1_CS_AS and not when using Slovenian_CI_AI even though the column is varchar.
as for Nvarchar default this may lead to serious perf degradation on very large db’s because you get twice less data read as you could with varchar.
[Unfortunately] true with regards to perf… That would not be the case, however, if they would’ve used UTF-8 instead of UTF-16…
Anyhow, the main “issue” still stands. Identically configured databases (with regards to collation), but different behavior.