An old company I worked for used project management software with a check-in/check-out mechanism for making changes. When you "check out" a project, it downloads a copy that you change locally; "check in" then uploads it back to the server. A project is "locked" while in the "checked out" state. We all felt it was an archaic mechanism in a world of live-updating apps.
After 10 years of building SPA "web apps", that data synchronization mechanism feels ahead of its time.
What many people either can't or don't want to acknowledge is that whether you support live updates by multiple users in parallel, instead of locking so only one update can proceed at a time, is ultimately not a technical decision but a business decision: do the business rules appropriate for your application let you deal with concurrent live updates or not?
Ultimately that comes down to whether you can implement a process to ensure consistent resolution of any incompatibilities between multiple concurrent updates. Sometimes that can be done, and sometimes it can't, and which is the case for your application depends on your business rules, not on any technical capability.
If your business rules don't allow you to implement a resolution mechanism, you need locking so that only one update can happen at a time, whether you have the technical capability to support concurrent updates or not.
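A minimal sketch of that kind of pessimistic check-out lock, using an atomic conditional update on a lock column (the table and function names here are made up for illustration, shown against SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, locked_by TEXT)")
conn.execute("INSERT INTO projects (id, locked_by) VALUES (1, NULL)")
conn.commit()

def check_out(conn, project_id, user):
    # Atomically claim the lock only if nobody else holds it.
    cur = conn.execute(
        "UPDATE projects SET locked_by = ? WHERE id = ? AND locked_by IS NULL",
        (user, project_id),
    )
    conn.commit()
    return cur.rowcount == 1  # True if we got the lock

def check_in(conn, project_id, user):
    # Release the lock; only the current holder may do this.
    conn.execute(
        "UPDATE projects SET locked_by = NULL WHERE id = ? AND locked_by = ?",
        (project_id, user),
    )
    conn.commit()

print(check_out(conn, 1, "alice"))  # True: lock acquired
print(check_out(conn, 1, "bob"))    # False: project is checked out
```

The point is that the conditional `WHERE locked_by IS NULL` makes the claim atomic, so only one update can ever proceed at a time, regardless of what the clients do.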
This is one of those phrases that should turn into a saying, and be passed around for hundreds of years.
Every hard problem I have today in my career involves getting business people to define their business problem properly so it can be solved with technology. Even the hardest code I've ever written was easy compared to some projects, simply because of the business issues lurking around them. Last week I finished a script to download a CSV and save it to a SQL table (literally) that took 3 weeks because the business folks couldn't get their act together on what they wanted. I finished another project in a few days which is currently the core of a previous employer's energy-efficiency controls product; it was easy because the person defining it did it very well, and I had no questions, just work to perform.
Nah. I've seen plenty of systems where the business rules would handle concurrent updates fine, but since they're using a traditional Web/ORM/RDBMS setup they build a last-write-wins system without thinking about it. It's one of those rare problems where the technical part is actually harder than the business part.
Database systems have been able to deal with concurrent updates for quite some time now, so I don't think doing this is technically difficult with the current state of the art. Individual dev teams might not be well versed in the current state of the art, but the correct business response to that is not to restrict your business rules but to get developers who are well versed in the current state of the art.
> Database systems have been able to deal with concurrent updates for quite some time now, so I don't think doing this is technically difficult with the current state of the art.
Traditional ACID systems can't really handle them nicely - your only choice with an update is to commit it or discard it - so you have to do a lot of handwritten logic on top, and even if the database itself handles that well, the layers above it generally don't. Event-sourcing style systems work well but they're still not really mainstream yet.
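For concreteness, the usual hand-written layer on top of a commit-or-discard store is optimistic concurrency with a version column: each writer says which version it read, and a stale write simply doesn't apply. A sketch (the table and column names are hypothetical, shown against SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, version INTEGER)")
conn.execute("INSERT INTO docs VALUES (1, 'draft', 1)")
conn.commit()

def update_doc(conn, doc_id, new_body, expected_version):
    # Only apply the write if nobody updated the row since we read it.
    cur = conn.execute(
        "UPDATE docs SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, doc_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means conflict: re-read and resolve

print(update_doc(conn, 1, "alice's edit", 1))  # True
print(update_doc(conn, 1, "bob's edit", 1))    # False: stale version
```

Note that the database only tells you a conflict happened; what to do next (retry, merge, ask the user) is exactly the business-rules question from upthread.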
It solves so many problems and makes it so easy to implement if you go this way.
But just like mentioned it is hard to convince people that it is what they actually want.
People fall into some grand illusion that everything should always be available, but in reality one person is making changes at a time, and if somehow two or more people have to work on something, more often than not they should be talking or communicating with each other anyway to synchronize.
Even with Git and fully distributed development you cannot resolve conflicts automagically. You still have to communicate with others and understand the context to pick the correct changes.
Unison has a neat approach to this problem: references are hashes of the abstract syntax tree, so the only way to write a "collision" is to write an identical function--which isn't actually a collision at all.
Good point. I do the same in my own system and use hashes of the source code, so there are no collisions. I predict this technique will slowly become mainstream.
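To make the idea concrete, a content-addressed reference can be sketched by hashing the definition itself (here just the raw source text; Unison actually hashes the AST, so formatting and naming differences don't change the hash):

```python
import hashlib

def ref(source: str) -> str:
    # Content-addressed reference: identical definitions get identical hashes,
    # so two writers producing the same hash have by definition not conflicted.
    return hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]

f1 = "def add(x, y): return x + y"
f2 = "def add(x, y): return x + y"   # byte-identical "collision"
f3 = "def add(x, y): return y + x"   # different definition, different ref

print(ref(f1) == ref(f2))  # True
print(ref(f1) == ref(f3))  # False
```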
This, totally. It's one of the reasons classical RDBMS paradigms and software like MySQL still survive, however much people want to talk them down in favor of "NoSQL" or non-relational databases like MongoDB, citing how fast or how cool they are in comparison.
For some things, you need the time tested solutions.
Sounds like RCS [1]. I remember, back when a company I worked for switched from RCS to CVS, one of my coworkers was annoyed that CVS didn't support locking checkouts.
Now I feel old. I remember "Anything but SourceSafe" [0], which was a follow-up to "Visual SourceSafe: Version Control Unsafe at Any Speed", and having my trust evaporate when I found out Microsoft didn't dogfood their own version control system.
To get back on topic, the key thing an explicit database gives you is a purpose-built language (and data-integrity enforcement etc., if you do it properly) that everyone knows. (Or used to? SQL is getting hidden behind abstraction layers/ecosystems these days.) I'm old, so I reach for my older, well-understood tools over the new and exciting. Get off my lawn. It may be over-architecting, but I'm also not working in "performance in milli/microseconds is vital" high-load environments, or releasing updated software every other day.
The other issue is tool/eco-system fragmentation.
But when you're young and have the energy and mental capacity to abstract out the wahoo for efficiency/performance, you do, because you can, because it's better at the time. In our day everyone was writing code to write code, which was effectively the precursor to ORMs. It's just part of being young and committed to your craft, and wanting to get better at it - this is a good thing!
It's only as you get older that you start to appreciate "less is more", around the same time that job ads appear with "Must have 3 years of SQLSync experience" (no offence intended here). There are both costs and benefits, but which, and how much of each, you only find out years later.
Are you sure? My experience of using TFVC was that it would warn you if someone else had opened the file for editing but would not actually lock it. Multiple people could edit the same file concurrently with standard automerging/conflict resolution afterwards.
Server workspaces vs local workspaces, maybe? With server workspaces, your local copy was marked read-only. Don't recall if you could change that flag to edit anyway. We moved to local workspaces as quickly as we could - that was a more typical offline-edit, resolve-conflicts-at-commit model. Don't remember all the details; it's been 5+ years since I did anything with TFS.
Yes, “tf edit” would mark on the server that you were opening the file for editing, and cleared the read-only bit, but it didn’t lock the file for others or prevent concurrent edits in any way.
Back in the early days of TFS I was briefly at a company that went all in on MS tools. They used TFS, and to avoid the lock each developer had a clone made; after they checked their clone in, the "TFS Guy" in the office would merge it. He also had to merge things when later check-ins had conflicting changes.
Now, the best part of this shit show: they had ~30 different customers, and each customer had a clone of the main thing that would be customized. So the "TFS Guy" had to decide whether to keep a change in the customer clone only or to propagate it to the main and then to all the other clones!
Needless to say the “TFS Guy” made a lot of money.
I'm a fan of this approach. SQLSync effectively is doing this continuously - however it would be possible to coordinate it explicitly, thus enabling that kind of check in/out approach. As for single-owner lock strategies, I think you could also simulate that with SQLSync - although you may not need to depending on the app. If the goal is to "work offline" and then merge when you're ready, SQLSync provides this pattern out of the box. If the goal is only one client can make any changes, then some kind of central lock pattern will need to be used (which you could potentially coordinate via SQLSync).
Looks very similar to JEDI [0], an early Delphi VCS that worked that way. It gave us the peace of mind of knowing that no conflict would appear, as only one developer could work on a locked/checked-out file at a time. There was no merging in those days. On the flip side, files that were frequently changed in every task would constantly block developers.
Did you know that Postgres has a max table size of 32TB? It's really, really fun to find that out on the Wednesday evening before Thanksgiving.
Make sure to prune old data from your tables. This one hit the limit because it eventually got so large that queries to delete old data would time out... so it just kept growing.
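A common way to avoid those delete timeouts is to prune in small batches, each in its own short transaction, instead of one giant DELETE. A sketch against SQLite (the table, cutoff, and batch size are all made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_day INTEGER)")
conn.executemany("INSERT INTO events (created_day) VALUES (?)",
                 [(d,) for d in range(1000)])
conn.commit()

def prune_old(conn, cutoff_day, batch_size=100):
    """Delete rows older than the cutoff in short, bounded transactions."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "  SELECT id FROM events WHERE created_day < ? LIMIT ?)",
            (cutoff_day, batch_size),
        )
        conn.commit()  # short transaction per batch: no long lock, no timeout
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

print(prune_old(conn, 500))  # prints 500: the oldest rows, removed 100 at a time
```

Run on a schedule, this keeps each individual statement small enough to finish well inside any query timeout.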
For 90%+ of our customers (small-to-mid sized US financial institutions), production is the only environment available to work with.
For the other 10%, we take them aside and politely explain that they almost certainly have an unusable staging environment per the scope of our B2B project.
Testing in production is a wonderful path if you are comfortable talking to business people and making lots of compromises.
I don't know if this is normal or if anyone else does it, but I usually either download binaries or compile from source, move the executable to /usr/local/bin/, and create a symlink. That lets me easily switch between versions. I avoid using a package manager for anything where I want control over the version and installation.
We have a 10TB database we switched from Aurora to Postgres, and it cut our bill by 80%. However, there are some differences in our schema, such as now using native partitions, so it's hard to tell how much of the savings is due to the switch and how much is due to our table and query design.
Curious what you mean by switching from Aurora to Postgres? AWS offers Postgres on Aurora, and Postgres on regular RDS. Do you mean you switched to RDS, or off of AWS altogether, or something else?
Probably means Aurora MySQL. In CloudFormation and other AWS artifacts, "Aurora" is a keyword that regularly comes up meaning MySQL, since that was the original target for Aurora years before the Postgres flavor was released. There are AWS old-timers at my shop that call it Aurora, and it shows up in their YAML.
To whomever downvoted: when specifying the AWS::RDS::DBCluster "Engine" property in CloudFormation, aurora = mysql5.6 and below, aurora-mysql = mysql5.7 or mysql8.x, aurora-postgresql = postgres. Since 5.6 was deprecated, the "aurora" engine type was removed from the CF docs, but it was there until a few years ago. "Aurora" was synonymous with MySQL for a while.
People downvoted because you were assuming that when someone says "we moved to postgres" they would mean "we moved to mysql" as if they wouldn't know what they were talking about.
Even your history argument makes no sense: Aurora Postgres was launched 9 months after the MySQL version, which came out in July 2015.
So now tell me: 7-9 years have gone by since Aurora became a multi-database product, so what sense does it make to assume it's MySQL? That's like saying people at your company call AWS "S3" or "SQS" because that's how it started. I don't even know what point you're trying to make, because unless the majority of that time was MySQL-only and the other databases were added only recently, the anecdote still wouldn't make sense.
> People downvoted because you were assuming that when someone says "we moved to postgres" they would mean "we moved to mysql" as if they wouldn't know what they were talking about.
What? That's not what that comment says at all. They're saying that aurora mysql was a plausible interpretation of what OP moved from, before OP clarified.
FWIW, I'm actively exploring native partitions on Aurora with Postgres and I'm seeing very little benefit. Two identical tables, each with 500M+ rows, and I'm not seeing any meaningful performance/IO changes. A composite index with the partition key as the first entry has been as effective for reducing IO and query time as partition pruning. I'm sure there are workloads where partitioning makes more sense, but I've been surprised by how little difference there was in this case.
That's only true if you have low traffic, in which case why not host from a $50/mo (at most) VPS? If a business can pay your salary, then surely it can afford an extra $50/mo in cloud costs.
>you don't have to manage servers
However, now you have to learn to write serverless functions that execute in an environment fundamentally different from your local machine, which makes development more difficult. So you've reduced time spent on devops and increased time spent on development.
Regarding cost, it can depend a lot on your traffic structure. If traffic is bursty, or substantially different between intraday peaks and troughs, it can be more cost effective. Solving this yourself costs dev time.
>reduced time spent in devops and increased time spent in development
This may be true if you’re trying to figure out how to do something you already know how to do outside of serverless, but IME many developers benefit from serverless eliminating boilerplate and nudging them away from state where it’s not necessary.
Also regarding cost and devops: I see serverless as an insurance policy against “my startup/product got a once-in-a-lifetime lucky break by going viral while I was asleep” causing you to fall over from a flood of traffic. Not only do you get to skip implementing your own scaling to go from 0-1 but you only pay substantially (vs the regular price diff) extra for it when it happens.
I've found some great use cases for serverless. None of those have involved hosting a backend for a website / web application. It's been a useful solution for automating some cloud management tasks though.
I wonder if these are the same systems that "detect fraud" and freeze my bank account, requiring manual intervention to fix, the two times a year I send a random family member less than $2,000.