Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
SQLAlchemy 2.0 Released (sqlalchemy.org)
264 points by zzzeek on Jan 26, 2023 | hide | past | favorite | 82 comments


I would urge people who have had issues with the documentation to give the 2.0 documentation a try. Many aspects of it have been completely rewritten, both to correctly describe things in terms of the new APIs as well as to modernize a lot of old documentation that was written many years ago.

First off, SQLAlchemy's docs are pretty easy to get to, for a direct link just go to:

https://docs.sqlalchemy.org/

It's an esoteric URL I know! ;)

from there, docs that are new include:

- the Quickstart, so one can see in one quick page what SQLAlchemy usually looks like: https://docs.sqlalchemy.org/en/20/orm/quickstart.html

- the Unified Tutorial, which is a dive into basically every important concept across Core / ORM : https://docs.sqlalchemy.org/en/20/tutorial/index.html

- the ORM Querying guide, which is a "how to" for a full range of SQL generation: https://docs.sqlalchemy.org/en/20/orm/queryguide/index.html

it's still a lot to read of course but part of the idea of SQLAlchemy 2.0 was to create a straighter and more consistent narrative, while it continues to take on a very broad-based problem space. If you compare the docs to those of like, PostgreSQL or MySQL, those docs have a lot of sections and text too (vastly more). It's a big library.


SQLAlchemy and your efforts are so amazing and appreciated. Thank you and the team for all of your hard work on 2.0 as well as the heroic effort that was put into 1.4 as a stepping stone for existing codebases!


Hey Mike! :) Hope all is well.


So confused after reading unified tutorial. It's so confusing and too many styles to declare ORM

>>> class User(Base): ... __tablename__ = "user_account" ... ... id: Mapped[int] = mapped_column(primary_key=True) ... name: Mapped[str] = mapped_column(String(30)) ... fullname: Mapped[Optional[str]] ... ... addresses: Mapped[List["Address"]] = relationship(back_populates="user") ... ... def __repr__(self) -> str: ... return f"User(id={self.id!r}, name={self.name!r},

VS

class User(Base): __tablename__ = "user_account"

    id = mapped_column(Integer, primary_key=True)
    name = mapped_column(String(30), nullable=False)
    fullname = mapped_column(String)

    addresses = relationship("Address", back_populates="user")

    # ... definition continues

Vs

user_table = Table( "user", mapper_registry.metadata, Column("id", Integer, primary_key=True), Column("name", String(50)), Column("fullname", String(50)), Column("nickname", String(12)), )

Vs

class Customers(Base): __tablename__ = 'customers'

   id = Column(Integer, primary_key = True)
   name = Column(String)
   address = Column(String)
   email = Column(String)
Why so many styles? Why can't be like pydantic or Django and use just one style for ORM?

it's breaking zen of python, it makes searching examples on Google very hard ( esp with Google getting really crappy)


> Why can't be like pydantic

Pydantic also has multiple styles, in fact, I think it has more styles than SQLAlchemy.

> and use just one style for ORM?

To be fair, one of the “styles” you show is not ORM, its the lower-level Core which can be used without the ORM and which the ORM is built on top of. And the two ORM styles include an older one largely for backward compatibility and a more modern one.


Then one really simple question: what to use ?


the current tutorial and quickstart guide should make it exceedingly clear, how is it that people are still confused?

this is the quickstart:

https://docs.sqlalchemy.org/en/20/orm/quickstart.html

that's the one you use. That's it! There is no other style listed there

lots of people still want to use other styles or still have to because they have legacy code. So these are documented also. However, SQLAlchemy wants you to use the style at:

https://docs.sqlalchemy.org/en/20/orm/quickstart.html

that's the only style we would have if we were starting now.


My recommendation is to avoid using an ORM (SQLAlchemy or any other) to define the business logic models (which is the approach used in Django). That works well only if your business models and your database entities align very closely, but in many large systems this is not the case, end you end up running around in circles trying to align the two.

Instead, use either an ORM, or even better the SQLALchemy core, to manage and abstract out the database; and use something like dataclasses, attrs or Pydantic to define your business logic models. In a network-oriented application, i.e. an API, I would recommend to use a separate tool to model the I/O entities, again unless they correspond very closely to the business ones.


So build your own application-specific bespoke ORM from dataclasses rather than using the one sitting right there?


When that pattern is done well, and it can be, it’s that you have a layer that doesn’t care that stuff is stored in a database. It’s lots of logic. That layer uses tools like dataclasses.

Then you have some layer that stores things in the database. It’s like a record keeper. It uses an ORM or not.

One benefit of this is ORMs are giant, complex objects by necessity. It’s good to only use them where they’re needed, at the database. Then you don’t have all their magic floating through your business code. They’re also rather opinionated. I, at least, experience whay more freedom when making a class from scratch than using an ORM.

This comes at a cost of maintaining two versions of lots of things and a translation layer in between. It is overkill for super CRUDy stuff (most apps).


If you are doing ORM on a greenfield project, the declarative style.


Which declarative style? Mapped or columns?


Columns. Even early on many years ago, I ignore all the exotic forms and different ways. Leave those for when you need to do something non-standard or programmatic.

Choose a form that works for you and looks nice, apply it to all your oddest use cases like many to many relationships and self-referential relationships, and then use that as a template going forward.


I can't see the difference between the first two "VS" examples you have above, it just looks like they are formatted differently (oh, one has annotations and the other doesn't. you'd use the annotated version. I guess you are cut-and-pasting from a migration guide that's for legacy code. "migration guide" means "I have legacy code". if you don't have legacy code, you have nothing to migrate. Go straight to https://docs.sqlalchemy.org/en/20/orm/quickstart.html ).

The second example is a Core table. The new tutorial makes it clear that this is an underlying element that nonetheless you have to sometimes create directly.

the third example is gone. That style is not in the docs anymore *except* in a clearly marked side-box that is talking about a legacy version of SQLAlhcemy - you don't have to read that and the box IMO makes it as clear as it can that this is not the current way of doing things.

> Why so many styles? Why can't be like pydantic or Django and use just one style for ORM?

pydantic is extremely new and SQLAlchemy's declarative style pre-dates Python 3 entirely. So we have changed it. The old ways of doing things have to be maintained for backwards compatibility with millions of existing applications. You can't be around for 16 years and have exactly one way of doing things that never changes, legacy patterns for large software libraries are a fact of life.

as for django I dont know what they are doing, I am sure things had to change for them to accommodate pep-484 typing in models (if they have even done that).


Great tool. I think this is why folks have trouble finding the docs:

Whole entire site is docs everywhere, including home page. Links to "docs" are tiny on the home page. Main menu calls them "library" instead of docs, why?

Majority of the stuff is focused on migration rather than new users. These two should be separated into separate tracks immediately. There is an attempt, but it gets lost somehow.

Also the release page linked here and "what's new in 2.0" go straight into the deep end, "inside baseball" style. These kind of pages should be very light. Tons of history-I'm personally not interested when trying to get something done. Unsolicited advice, move that into another blog post. ;-)


> Whole entire site is docs everywhere, including home page. Links to "docs" are tiny on the home page.

you got it, look now. If you can't find the docs now, I dont know what to say.

> Main menu calls them "library" instead of docs, why?

A library is where you'd go to learn...but I guess, because the product is a "library", that's why it's not obvious? The term is gone from the nav etc. the sub-page is still called "library" but nobody needs to go there.


The link I posted here is the blog post. That's why it's historical and chatty, it's the blog post.


Love your tools!

I think you should consider changing "library" to "documentation" - for whatever reason I also struggled to discover "library" actually meant docs - but only after I also clicked on a specific version.

I think the best UI/UX for this is something like django's [0], where "documentation" brings you straight to the latest version's documentation.

[0] https://www.djangoproject.com/


> I think you should consider changing "library" to "documentation" - for whatever reason I also struggled to discover "library" actually meant docs - but only after I also clicked on a specific version.

you got it, look now

> I think the best UI/UX for this is something like django's [0], where "documentation" brings you straight to the latest version's documentation.

the design of our main docs index page was taken completely from how Django does it (e.g. how they have "paragraphs of links" at https://docs.djangoproject.com/en/4.1/)


Oh yeah I love it!

I don't know why but it's just way more intuitive this way for me at least, thanks!


As a first time reader of these docs, they don't seem at all approachable.

For example: for the quickstart, the premise is simple - show me quickly how I can create a class, map it to the database, create some examples, and delete them. Something like what mongoose does here: https://mongoosejs.com/docs/

Instead, SQLAlchemy goes from how to declare models (with a wall of code followed by a wall of text), jumps into creating "engines", shows some generated SQL code for creating tables, and so on.

Sessions are simply used without being introduced, and many questions arise - am I hoarding a database connection as long as I am holding onto the session? Why should I do anything other than executing queries while that is happening? Then that means the `with` expression is useless, unless I want to explicitly run a multiple-query transaction - but why should this be the default?

It also comes across as clunky to be importing a `select` statement at the top level. Why not call select on the table objects themselves? And what's up with `select(...).select_from(...)`?


Just want to thank you for all the work you do! We are heavy users of sqlalchemy, our life would be much harder without it!


youre welcome, glad it is of help


Pandas had a docs sprint awhile back. Are the DOCs issues labeled?


This has been a long time coming. Many congratulations.


Does it fully support types?

Edit:

Oh, it seems to, that's big. For comparison, Hibernate, the Java mammoth ORM, as far as I know, never fully moved to Java 5 generics (released 18 years ago).

Also dataclasses and enums. Also big, since there used to be a lot of duplication in configuration, SQLAlchemy doing its thing and the rest of the codebase just using now standard Python 3 features.

Cool!


Not only are the ORM objects typed, the query results are typed too. Last time I checked on 2.0 progress there wasn't a feasible way to implement this, so I'm thrilled to see that it made the release.

    # (variable) stmt: Select[Tuple[int, str]]
    stmt = select(User.id, User.name)
    
    with Session(e) as sess:
        for row in sess.execute(stmt):
            # (variable) row: Row[Tuple[int, str]]
            print(row)
    
        # (variable) users: Sequence[User]
        users = sess.scalars(select(User)).all()
    
        # (variable) users_legacy: List[User]
        users_legacy = sess.query(User).all()
https://docs.sqlalchemy.org/en/20/changelog/whatsnew_20.html...


It’s honestly something of a marvel to see a big ol’ framework like SQLAlchemy successfully manage these big structural api changes. The changes that have lead up to v1.4 and v2.0 have been very significant improvements to the user interface, and have kept the library up to date with the language in a way that is rare to see for a code base of such complexity. Also the new unified bulk insert interface is a big win for my ETL workflows.

Hats of to everyone be involved!


I learned the hard way that I need to always use SQLAlchemy. I used other Python ORMs before and has been bitten hard with random bugs.


I will always stand behind SQLAlchemy being one of the best designed ORMs around. It generates some of the best backend SQL for the chosen dialect and it’s separation behind low level “core” concepts and high level ORM concepts makes for the cleanest separation of concerns. Even if it can’t generate the best SQL for you, it’s bog-simple to fiddle with the core components to hint it in the proper direction.


Yeah, ORMs are ten thousand edge cases, it really lays to use a battle hardened one that has been lovingly maintained and improved constantly for over a decade.

Hats off to Mike, most projects that are this age are decrepit and you have kept it fresh.



Love the fact that Release notes (https://www.sqlalchemy.org/blog/2023/01/26/sqlalchemy-2.0.0-...) give decent credits to SQLModel (https://sqlmodel.tiangolo.com/) for some of the typing innovations.


I was just working on sqlalchemy. zzzeek's help if you get stuck somewhere is outstanding. Async is a godsent feature for us.


Out of curiosity, what's the benefit of async for you?


Async sqlalchemy is required if we want to use an async appserver(tornado). We can serve a lot more requests per second with async. Otherwise, tornado was sync and could only serve one request synchronously at one time.


Thanks:)


Non-blocking I/O is essential for high throughput, but high latency internet native databases such CockroachDB.


I would request an experience sqla developer to write a blog post, tutorial or even a paid course for new sqla, doing general queries and necessary db operation, using 2.0 specific, best ORM way of doing things ( only one ORM style).

And it should become default go-to doc for every new comers


SQLAlchemy's lightweight SQL wrapping was a revelation when I came across it, but I've come to rely a lot on Django's migration tooling, enough so to choose Django for projects. What do people using SQLAlchemy use for migrations?


Alembic is the most common tool paired with SQLAlchemy.


Alembic is great. It will generate most schema changes from your updated models' code. I even use it programmatically in one app, to automatically migrate the SQLite database on the user's machine, and it has been extremely reliable.


I tried Alembic a long time ago, and it was a bit unstable then, but it was long enough ago that I seem to remember it being new, so maybe it's fixed up now. I'll give it another look.


Alembic


I just asked ChatGPT "How can I migrate database structures using SQLAlchemy?" and it gave an amazing answer. Alembic with examples.


If you use pandas with SQLAlchemy, you should be aware that pandas might not be fully compatible with SQLAlchemy 2.0 yet.

Some github issues:

https://github.com/pandas-dev/pandas/issues/40686

https://github.com/pandas-dev/pandas/issues/40460

I'm not actively following how compatible pandas has become, but you should absolutely test your code if you want to upgrade to 2.0.


Congratulations on the release. I've been a happy user of SQLAlchemy for over a decade, and I'm looking forward to giving this version a spin.


For a decade I avoided sqlalchemy, we finally got to use it last year and found it quite overwhelmingly complex and 1.4 documentation is messy. We had to use mixed styles.

I checked unified doc and it had improved but , still wall of texts.. quite hard to read. Readability count's.


Absolutely agree, coming from the Node.js world documentation for Python packages (especially more established ones) is largely unapproachable, wording is unnecessarily obtuse and descriptions go on for too long. Code is presented in walls instead of being elegantly interspersed with short descriptions.


What do you use instead?


Now we're using Django for current project. But for new project we planned to use Starlite framework so we are still deciding 1. PicoloORM 2. Sqlalchemy 2.x

And now finding sqlalchemy 2.0 still way too complicated


Have you looked at peewee? It might be just what you're looking for.

http://docs.peewee-orm.com/


It doesn't matter how good a library is, if care is not taken to make the documentation clean, accessible, elegant, concise, its legs are cut off from the start.

Documentation authors should get out of their own heads while writing, and try as hard as possible to attain the perspective of a complete newcomer who will have no idea why a library needed to be split into Core and ORM in the first place.


From https://docs.sqlalchemy.org/en/20/dialects/ :

> Currently maintained external dialect projects for SQLAlchemy include: [...]

Is there a list of async [SQLA] DB adapters?

The SQLAlchemy 2.0 Release Docs: https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.htm...


Incredible work, the best python database tool just got even better!


Sorry for detracting from the release news but SQLAlchemy has to have the least helpful website of any major library I can think of. I'm trying to find basic 101 code examples and it's just one wall of text after another. I found how to cite SQLAlchemy in a research paper and have yet to find a single code example.


Under the Current Releases section, click the docs link then ORM Quick Start. But it's tough to find at a glance.


Yes, that's probably my biggest complaint about the library. The docs are there but hard to navigate. Sometimes crucial tips are buried in an aside on one of the many pages that talk about the same topic. It's difficult to get started and it's also difficult to find the ultimate explanation of what's going on.


Too much repetition in docs . The framework might be good but the doc is lacking simplicity. I still can't get a grasp of what going on while reading.


I miss ActiveModel and ActiveRecord so much (after recently switching jobs to a python shop), docs, api etc were just so polished and battle hardened (yes they have their problems, but I was pretty proficient in using them well)


Check out pugsql, we switched from SQLAlchemy to that because the former was overkill, and way too complex.


I find ORM models useful for type checking and migrations, what is your experience for this with pugsql?


At the risk of sounding flippant (which I promise is not the intention), if I care about type safety I don’t use python.


Why? Python 3.8+ have perfect typing


Simply because other languages such as haskell are much, much better at that.


peewee is also pretty good


It's unmaintained..


I generally try StackOverflow/Google, then documentation and then discussion issues on SQLAlchemy. IT has worked well for me.


When I see such a poor website, I also expect the codebase to be clunky and cluttered.

Better alternative: - https://pugsql.org/


I've worked for close to a decade with Django's ORM and recently with TypeORM, both have been simple to use and generally (minus some weird things I tried to do) a pleasure. I recently started working with a not so important project at my current job with sqlalchemy because I had to use Flask, I cannot describe the pain I've felt working with this, I dread having to write yet another query with sqlalchemy. The docs are terrible, some of the worst I've ever seen, an obscure and undocumented government library I'm also using at work has been easier to learn by reading its code than sqlalchemy. Someone above mentioned that the new 2.0 docs are better, I seriously hope they are, they will make my suffering more tolerable. However, even if the docs are good the API is still the worst and least intuitive I've ever seen, it honestly feels like I'm writing raw sql code shaped like python code. But it is weird, non standard and difficult to follow, unlike the official python sql interface. I recently had to write an upset and it felt like I was trying to summon a forgotten demon. It would have honestly been much easier to just write it in raw sql.

I'm sorry for the wall of text, I wanted to let you know that I'm very grateful for you suggesting that library. I hope I can migrate this (still) relatively small codebase away from sqlalchemy. I'm going to give pugsql and peewee a try, both have been mentioned in this thread as good alternatives.

I can summarize my basic complain here: the abstraction layer that sqlalchemy provides is more complex than SQL itself. It's almost not worth it, add to that the docs are a huge mess and you get something only diehards and people who have been using it for decades want to use.


> Someone above mentioned that the new 2.0 docs are better, I seriously hope they are, they will make my suffering more tolerable. However, even if the docs are good the API is still the worst and least intuitive I've ever seen

Thats Exactly my experience using SqlAlchemy on previous project , which i choose to do with Starlite framework. The pain is horrible. I don't want to even touch writing another query , even if it is a simple CRUD .

>However, even if the docs are good the API is still the worst and least intuitive I've ever seen, it honestly feels like I'm writing raw sql code shaped like python code. But it is weird, non standard and difficult to follow, unlike the official python sql interface. I recently had to write an upset and it felt like I was trying to summon a forgotten demon. It would have honestly been much easier to just write it in raw sql.

At this point , only DjangoORM is fun , and even raw SQL is a lot more fun than SQLAlchemy .. I had looked into docs and it seems that not much is improved. API still have many ways to do a single thing , returning a bit different results .

I don't understand why Python Community embrace SQLAlchemy and touted it as best ORM in the world , while it breaks all the Zen of python. Even plain old SQL statements are closer to Philisophy of python.

pugsql looks a lot of fun but when i checked the last commit , it was almost a year ago..


Yeah, no. A programming library using half my viewport for an empty banner with a logo and a menu shows they are not to be taken seriously. Like at all.


This is neat - I'd never heard of it.

I've been using psycopg2 directly with a thin layer for managing connections and transactions using thread-locals for a long time - RealDictRow and some adapters for jsonb and a few other types make it really easy to work with.


its dead though.


Most projects stop updating the previous major release.

https://www.psycopg.org/psycopg3/


Congratulations!

Given how popular SQLAlchemy is, I wonder how many projects are going to wake up on Monday with failing tests (or production) due to not pinning versions and ignoring deprecation warnings...


Can someone share a summary as to what the caching strategy is and how it contributed to such gains?


Fantastic, been waiting for this!


just want to thank the developers and community for their amazing effort. i just started webdev in python (which i did not previously used) and sqlalchemy's docs are amazing. again, thank you!


Congratulations!


Wall of text.

Less is more.

Non-blocking I/O is wonderful news though, especially for high throughput yet high latency internet native databases such CockroachDB.


"internet native" is such a BS term. How has the rest of the internet survived this long, without their "native" tool...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: