Home > PostgreSQL, SQL Server > Small differences between SQL Server and PostgreSQL

Small differences between SQL Server and PostgreSQL

In my copious amount of spare time Smile, I am currently working with Gapminder to build a data warehouse. We are using PostgreSQL and Ruby Rails as the development platform.

As I learn PostgreSQL, I am running into some interesting differences. For the SQL Server/PostgreSQL people out there, I thought it might be interesting to describe some of them in case you are transitioning from one to the other.

 

SQL Server PostgreSQL
INSERT t VALUES (…) This syntax is not allowed. Allows:
INSERT INTO t VALUES (…)
BULK INSERT and BCP uses COPY instead
(which has the functionality of both BCP and BULK INSERT)
Management Studio pgAdmin
Bit type Boolean type (accepts values true and false)
IDENITTY Has sequencers (like Oracle)
default schema is dbo default schema is PostgreSQL
Default Listening on 1433 Default listening on 5432
datatype: varchar(max) datatype: text
Key is clustered by default key is not clustered by default (and it is enforced by a constraint and not an an index!)
User Defined Data Types Domains
user: sa user: postgres
No such thing NATURAL and USING joins
SELECT TOP 10 * FROM t SELECT * FROM t LIMIT 10
Query plans read from right to left Query plan read from left to right
Estimate Query Plan: CTRL+L Estimate Query Plan: F7
Advertisements
  1. May 20, 2011 at 08:31

    pgAdmin is just one tool, there are many others available.

    You can create new tables (and other objects) using the GUI, no problem: => edit => new object => Go! The schema “PostgreSQL” doesn’t exist, I think you mean “public”. For User Defined Data Types in PostgreSQL, you can create a DOMAIN but also a TYPE. A DOMAIN is an extension to a current datatype using check constraints, a TYPE can be something completely new. Just check the manual.

    • Thomas Kejser
      May 29, 2011 at 13:06

      Thanks Frank

      I am still learning this thing better and will update the post. I played around with the edit function and as you can see I updated the post (will update again next time I code).

  2. May 20, 2011 at 15:30

    I played with PostgreSQL a little I and really like it. Has a very rich feature set!

  3. Thomas Kejser
    May 20, 2011 at 16:30

    I was surprised at the elegance of PostgreSQL. it is a very “clean” implementation of a relational database (unlike MySQL, which is a mess)

  4. May 20, 2011 at 16:58

    Yes, it’s really well done. Do you also tried to play with advanced features like table inheritance? So cool.

    • Thomas Kejser
      May 29, 2011 at 13:01

      Did not try table inheritance yet. I like the “clean” implementation of the engine – very back to basics relational

  5. July 4, 2011 at 01:06

    I feel like using TOP in SQL Server doesn’t match the functionality of LIMIT where LIMIT has both offset and row count. How would one try to e.g fetch rows 10-20 in SQL Server?

    • Thomas Kejser
      July 4, 2011 at 01:15

      That depends on what rows 10-20 means?… The moment you dont just fetch “some amount” you have implied ordering of the dataset. If you want that functionality in SQL Server, you will typically use the ROWCOUNT function and add a WHERE critieria on it. I actually think that is a much cleaner implementation than the LIMIT

      • July 4, 2011 at 15:26

        Lets say I have a list of names spanning 10 000 rows.
        How would I retrieve row 2 000 to 3 000?
        It’s useful when creating a pagination system where you wouldn’t want to fetch all results and only list a 1 000 of them.

        Page 1 : row 0-1000
        Page 2 : row 1001-2000
        Page 3 : row 2001-3000

        And so on.

  6. Thomas Kejser
    July 4, 2011 at 15:59

    Svante: My point above was exactly: “What do you mean when you say rows 2000 to 3000?”. Recall that in relational algebra, tuples have no order (the fact that we sometimes index them is a physical structure we build to assist queries, but no ordering is inherent in the model). Only by implying ordering in a query does it make sense to talk about a specific row offset, but this offset will vary by query and is not an attribute of table.

    Example: rows 1000-2000 are undefined for a statement like this:
    SELECT A,… FROM Table

    But defined for a statement like this:
    SELECT A, … FRMO Table ORDER BY A

    In the last case, you can use the SQL Server ROWCOUNT function to retrieve a specific offset of rows (because it needs the rows to be ordered to assign a rownumber to them).

  7. July 17, 2013 at 13:10

    Another difference is cost. You get one for free
    Regarding features – We get almost all the feature free of cost in postgrest plus a lot more high performance setup for most business requirements

    Regards
    Sreeni
    http://www.sreenivaskandakuru.com

    • July 17, 2013 at 14:36

      Sreenik: have you ever actually benchmarked PostGres against MSSQL? I have… If you had – you would realise that PostGres still has a very long way to go to get good performance. MySQL is somewhat better (but with a horrible syntax).

      You get what you pay for…

      • July 18, 2013 at 09:06

        Thomas,

        You know that MSSQL cannot run on linux whereas postgres sql can run both on linux and windows. To do benchmarking we need to run in like to like enviroments. postgres sql works best running of linux where for mssql there is only one option

        Having said that redhat and microsoft claim they are the best.
        I have a link from redhat http://www.redhat.com/pdf/rhel/bmsql-postgres-sqlsrvr-v1.0-1.pdf

        Regards
        Sreeni

      • Thomas Kejser
        July 18, 2013 at 09:32

        Yes.. I benchmarked PostGres on Linux vs. MSSQL on Windows. The problem is the same, PostGres has really poor scale. Running an OSS Operating System does not magically give you scale – poor data structures are poor data structures. I dont see “it runs on multiple operating systems” as an advantage for a database engine. Who gives a damn what OSS it runs on as long as it delivers the throughput and scale.

        Interesting link by the way, i have run exactly this test on a MSSQL and the result is 250K Tx/min on SQL Server without any tuning. Maybe I got that number because I didn’t cheat like this test did:

        “The throughput demonstrates scaling only to the point where the network becomes the bottleneck for the client-server remote connections. ”

        So we can conclude that the only thing the Redhat test measure is the efficiency of the driver stack – the results show nothing about scalability at all. Since they use the JDBC driver, I am not surprised to see that the Linux version is faster at low speed and scale. Windows people generally dont use Java, because they have better alternatives, so the Java driver stack does not see as much love as other connectivity libraries.

        Even if you DID remove this network bottleneck, this test is showing PostGres in the best possible light because the internal locks in PostGres scale very poorly (and the memory manager sucks) and this hurts less on a 2 socket than on a bigger machine. Now, try to run the same test on a 4 socket and you will discover the Postgres completely falls over while SQL Server just keeps delivering.

        Please apply some critical thinking before spouting the OSS religion…

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s