This Monday Jim Fulton, one of the first Python contributors, hosted a webinar about storing JSONB documents in PostgreSQL. Watch it now:
Known mostly for its mature SQL and data-at-scale infrastructure, the PostgreSQL project added a “JSONB” column type in its 9.4 release, then refined it over the next two releases. While using it is straightforward, combining it with the database’s other facilities in hybrid structured/unstructured applications can require skill.
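To give a flavor of what that looks like, here is a minimal, hypothetical sketch of a hybrid table with a JSONB column, using psycopg2; the connection string, table, and data are invented for illustration:

    import psycopg2
    from psycopg2.extras import Json

    conn = psycopg2.connect("dbname=demo")  # hypothetical connection string
    cur = conn.cursor()

    # A hybrid table: ordinary structured columns next to a JSONB document.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id serial PRIMARY KEY,
            created timestamptz DEFAULT now(),
            body jsonb
        )
    """)

    # Store a Python dict as a JSONB document.
    cur.execute("INSERT INTO docs (body) VALUES (%s)",
                [Json({"type": "order", "total": 42})])

    # Query inside the document with the @> containment operator; a GIN
    # index on body would make this kind of query fast.
    cur.execute("SELECT id, body FROM docs WHERE body @> %s",
                [Json({"type": "order"})])
    print(cur.fetchall())
    conn.commit()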
In this webinar, Python and database consultant Jim Fulton shows us how to use JSONB and related machinery for pure and hybrid Python document-oriented applications. We also briefly discuss his long history, going back to the start of Python, and finish with his unique NewtDB library for native Python objects coupled to JSONB queries.
Jim uses PyCharm Professional during the webinar. PyCharm Professional bundles the database tools from JetBrains DataGrip, our database IDE. However, the webinar itself is focused on the concepts of JSONB.
You can find Jim’s code on GitHub: https://github.com/jimfulton/pycharm-170320
If you have any questions or comments about the webinar, feel free to leave them in the comments below, or reach us on Twitter. Jim is on Twitter as well: @j1mfulton.
-PyCharm Team
The Drive to Develop
Speaker talks about scalability, proposes
select nextval(tablename)
to get the next id safely. I’ve heard enough.

What’s the problem with doing that? (I’m really asking, for educational purposes.)
As far as I know, the database will make sure there are no conflicts in this case, no?
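Here’s a sketch of what I mean (sequence and connection names invented). As I understand it, nextval() operates on a sequence, each call atomically hands out a distinct value even across concurrent sessions, and a rolled-back transaction just leaves a gap in the numbering:

    import psycopg2

    conn = psycopg2.connect("dbname=demo")  # hypothetical connection string
    cur = conn.cursor()

    # Each nextval() call atomically returns a distinct value, even under
    # heavy concurrency; aborted transactions leave gaps, never conflicts.
    cur.execute("CREATE SEQUENCE IF NOT EXISTS doc_ids")
    cur.execute("SELECT nextval('doc_ids')")
    (new_id,) = cur.fetchone()
    print(new_id)
    conn.commit()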
That was pretty educational; great topics covered. I have a question for Jim, though… when he talks about using pq for queuing emails, has he ever tried to do this along with Celery? Just wondering if this can substitute for something like RabbitMQ or Redis, and why it would be better. I see some advantages in having a table of queued messages, but aren’t you shooting yourself in the foot trying to do something better than RabbitMQ?
Thanks for a great talk!
The problem with Celery (and SQS) is that the handoff isn’t transactional.

I highlighted pq because it is at least a solution that has a transactional handoff.
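Here’s a rough sketch of what that handoff looks like with pq, based on its documented API; the connection string, queue name, and users table are invented for illustration:

    import psycopg2
    from pq import PQ  # the 'pq' package from PyPI

    conn = psycopg2.connect("dbname=demo")  # hypothetical connection string
    pq = PQ(conn)
    pq.create()  # one-time setup: create the queue table

    queue = pq['emails']

    # The put happens inside the surrounding transaction, so the job is
    # enqueued if and only if the rest of the work commits.
    cur = conn.cursor()
    cur.execute("INSERT INTO users (email) VALUES (%s)", ["a@example.com"])
    queue.put({'to': 'a@example.com', 'template': 'welcome'})
    conn.commit()

    # A worker pulls jobs; get() returns None when the queue is empty.
    job = queue.get()
    if job is not None:
        print(job.data)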
But you’re right, managing queues is hard, especially when dealing with various worker errors, and using something established is attractive.

There was another solution, zc.async, that provided queues based on ZODB, and it was very complex.

I think this is mostly an unsolved (and sadly unrecognized) problem.
A way I solved this in the past for SQS was to use a very simple transactional database queue whose job was to very temporarily hold jobs to be submitted to SQS. Jobs were moved from the database-based queue to SQS after commit and, in the rare cases where that failed, the jobs were retried later by a clean-up process. The database-based queue can be extremely simple because the workers have only one thing to do; failure is rare, and retries are virtually guaranteed to succeed eventually, generally on the first retry. I could easily imagine pq being used this way.
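Roughly, the shape is something like this sketch (the table name, schema, and send function here are hypothetical; the clean-up process would just call drain_outbox() periodically):

    import json
    import psycopg2

    conn = psycopg2.connect("dbname=demo")  # hypothetical connection string

    def setup(cur):
        # The staging table only needs an id and the job payload.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS outbox (
                id bigserial PRIMARY KEY,
                payload jsonb
            )
        """)

    def enqueue(cur, payload):
        # Stage the job inside the caller's transaction: if the business
        # change rolls back, the job disappears with it.
        cur.execute("INSERT INTO outbox (payload) VALUES (%s)",
                    [json.dumps(payload)])

    def drain_outbox(send):
        # Run right after commit, and again from a periodic clean-up
        # process, so jobs stranded by a crash get retried later.
        cur = conn.cursor()
        cur.execute("SELECT id, payload FROM outbox"
                    " ORDER BY id FOR UPDATE SKIP LOCKED")
        for job_id, payload in cur.fetchall():
            send(payload)  # e.g. an SQS send_message call
            cur.execute("DELETE FROM outbox WHERE id = %s", [job_id])
        conn.commit()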
It would be useful to have a transactional broker for Celery.
Good point, Jim.
I’ll give pq a try, to see how it behaves dealing with async queue processing.
Thanks!