I’ve just come back from OracleCode Singapore. It was a great event – the venue was awesome and the attendees were engaged and interested in the content. But one thing that I found amusing (disturbing perhaps?) was the number of times people approached me on the topic of scaling. The conversation would typically run along the lines of:
“What is your recommendation for scaling?”
which almost suggests that scaling is, in and of itself, the end solution here. Not “Here is function X, and I need it to scale”, or “My business requirement is X, and it needs to scale”, but just “I need to scale”.
So I’d push back and ask more questions:
- Scale what?
- What data are you capturing?
- Where is it coming from?
- What speed? What volume?
- What are your plans for the data you are capturing?
- How do you intend to process the data?
- Is it transient? Are you planning on storing it forever? Is it sensitive information?
And the scary thing is – more often than not, those were questions to which they did not have answers (yet?). I’d ask the questions, and very quickly the conversation would return to:
“Do I need sharding?”
“Should I use a NoSQL solution?”
“What ‘aaS’ option should I be using to achieve my scaling needs?”
“How many nodes do I need?”
“What server configuration is best?”
I’m seeing this more and more – that the technological approach to achieving a business requirement is seen AS the business requirement. I hate to be brutal (well…that’s a lie, I like being brutal) but here’s the thing – stop being so damn focussed on scaling until you have an idea of what your true performance requirements are!
Don’t get me wrong – there are systems out there that need to be architected from the ground up to deal with scaling challenges that have perhaps never been tackled before. But read those last few words again: “never been tackled before”. Do you know what that also means? It means it applies to an itsy-bitsy tiny percentage of IT systems. If that percentage weren’t so tiny, then surprise surprise – those challenges would have been tackled before. Why does everyone in the IT industry think that the system they are about to build will need the same architectural treatment as those 0.00001% of systems in the world that truly do?
Because in almost all of my time in IT, for the other 99.999% of systems out there, the two critical ingredients for scaling a system to meet (and well and truly exceed) the performance the business needs have been pretty simple:
1) don’t write crappy code,
2) don’t store data in a crappy way
That’s it. When you can definitively demonstrate that
a) your code is well written,
b) your data is being stored in a way that best serves the business requirements
and your application still cannot meet performance needs, then yes, it’s time to talk about architectural options for scaling. But more and more I see folks ignoring (a) and (b), or worse, just assuming that they are implicit and guaranteed to happen, and leaping straight into “I need a 10,000 node, geo-dispersed, NoSQL, cached, compressed, mem-optimized, column-based, non-ACID, distributed blah blah blah for my system to work”.
Here’s a reality check – you don’t. Save yourself a lot of hassle: start simple and focus on quality. You’ll find things will probably scale just fine.
If you’ve made it this far through the post and you think I’m just ranting…well, that’s true but let me also answer the next obvious question:
“So how do we make sure we write good code? How do we make sure we store our data intelligently?”
That’s why we (developer advocates) are here. We’re here to help you succeed. So check out our resources, reach out to us via social media channels, and we’ll help you every step of the journey.
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” (Knuth, Donald. “Structured Programming with go to Statements”, ACM Computing Surveys, Vol. 6, No. 4, Dec. 1974, p. 268.)
I.e., scale the hardware and _THEN_ bloat the software 🙂
I’d contend it’s the opposite…in 97% of cases, it is scaling the hardware that gives you a small efficiency gain, and writing better software that gives you the massive/important efficiency gains. And better software is rarely bloated.
One simple example of bad “scaling”: huge connection pools, often a sign of crappy code.
I always say test with one connection first and make sure that connection is used wisely.
If you can’t get much done with one connection, there are too many round trips and/or the client code is not releasing the connection quickly.
Then test with two connections and look for interference between the two sessions.
Most applications out there should unscale!
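The “test with one connection first” advice above can be tried out with a rough sketch like the one below. It is only an illustration under stated assumptions: it uses Python’s built-in sqlite3 with an in-memory database as a stand-in for a real networked database, the `CountingConnection` wrapper and the table `t` are hypothetical names invented for this example, and it assumes each `execute`/`executemany` call would cost one round trip on a real driver (which depends on the driver).

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper: one connection, counting calls made through it.

    Assumption: on a networked database each call here would be one round
    trip. Real drivers may batch differently.
    """

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.round_trips = 0

    def execute(self, sql, params=()):
        self.round_trips += 1
        return self.conn.execute(sql, params)

    def executemany(self, sql, rows):
        self.round_trips += 1
        return self.conn.executemany(sql, rows)


def load_row_by_row(db, rows):
    # Chatty client: one call per row.
    for row in rows:
        db.execute("INSERT INTO t (id, val) VALUES (?, ?)", row)


def load_batched(db, rows):
    # Same work handed over in a single call.
    db.executemany("INSERT INTO t (id, val) VALUES (?, ?)", rows)


def run(loader, n=1000):
    """Load n rows with the given loader; return (calls made, rows stored)."""
    db = CountingConnection()
    db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    db.round_trips = 0  # only count the load itself, not the setup
    rows = [(i, f"v{i}") for i in range(n)]
    loader(db, rows)
    stored = db.conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    db.conn.close()  # release the connection as soon as the work is done
    return db.round_trips, stored


if __name__ == "__main__":
    print("row by row:", run(load_row_by_row))
    print("batched:   ", run(load_batched))
```

Both loaders store the same rows; the difference is how many times the single connection is asked to do something. If that count grows with the data volume, adding more connections to the pool is likely to hide the problem rather than fix it.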
I would like to sympathize with some of the people who feel the need to ask those confused and vague questions. Here’s the reason, as you put it: “there are systems out there that need to be architected from the ground up…”
Nobody wants to be in that position. THAT does not constitute a bug; it’s a failure (which can feel like a personal one) and it can be an epic one when you realize the amount of money which has been wasted on a failing system.
Everybody knows that you’re not supposed to write “crappy code”, but the developers themselves have to be the judge of that (or be judged by people in the same circle, who quite possibly have the same level of knowledge), and if they knew what constitutes a piece of “crappy code”, they would obviously never have written it that way.
It’s the FEAR of ending up in that situation which drives people to start asking questions like that before thinking them through, and the stigma of being accused of “writing crappy code” doesn’t help. It’s the anxiety of being called out when you don’t know what you don’t know.
That is not necessarily a stupid question. It can be a cry for help, because many experts are too eager to criticize practices which lead to scalability issues and less eager to explain how scaling issues occur and how to prevent them. You can find many blogs dedicated to analysing (e.g.) the optimizer’s behaviour, but do we have concentrated resources about scalability? I don’t think so.
And in the end, optimizer edge cases might lead to some headaches, but they rarely lead to failures which “…need to be [re-]architected from the ground up…”
That fear is real. That anxiety is justified.
If you want to scale, one thing you can do to measure your scaling capability is to count the number of process context switches that take place to get X (some use case in your application) done.
Ideally that should be 1.
If it’s orders of magnitude more, you have a scaling issue.
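As a rough sketch of how such a count might be taken, the snippet below reads the operating system’s context-switch counters for the current process before and after running a use case. This rests on an assumption: it interprets “process context switches” as OS-level switches reported by `getrusage`, whereas the comment may equally mean switches between the application and the database. The `measure` helper is a hypothetical name, and the `resource` module is only available on Unix-like systems.

```python
import resource


def context_switches():
    """Return (voluntary, involuntary) context switches for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(use_case):
    """Run use_case() and report how many context switches it incurred.

    Hypothetical helper: returns (result, voluntary_delta, involuntary_delta).
    """
    vol_before, invol_before = context_switches()
    result = use_case()
    vol_after, invol_after = context_switches()
    return result, vol_after - vol_before, invol_after - invol_before


if __name__ == "__main__":
    # Stand-in for "X": a purely CPU-bound task that never needs to wait.
    result, voluntary, involuntary = measure(lambda: sum(range(1000)))
    print(f"result={result} voluntary={voluntary} involuntary={involuntary}")
```

Voluntary switches usually mean the process gave up the CPU to wait for something (I/O, a lock, a reply from another process), so a use case that racks up many of them is spending its time waiting rather than working.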