Aside

 


Databases are the spine of the tech industry: unsung, invisible, but critical–and beyond disastrous when they break or are deformed. This makes database people cautious. For years, only the Big Three–Oracle, IBM’s DB2, and (maybe) SQL Server–were serious options. Then the open-source alternatives–MySQL, PostgreSQL–became viable. …And then, over the last five years, things got interesting.

Some history: around the turn of this millennium, more and more people began to recognize that formal, structured, normalized relational databases, interrogated by variants of SQL, often hindered rather than helped development. Over the following decade, a plethora of new databases bloomed, especially within Google, which had a particular need for web-scale datastore solutions: hence BigTable, Megastore and Spanner.

Meanwhile, Apache brought us Cassandra, HBase, and CouchDB; Clustrix offered a plug-and-play scalable MySQL replacement; Redis became a fundamental component of many Rails (and other) apps; and, especially, MongoDB became extremely popular among startups, despite vociferous criticism — in particular, of its write lock which prevented concurrent write operations across entire databases. This will apparently soon be much relaxed, after which there will presumably be much rejoicing. (For context: I’m a developer, and have done some work with MongoDB, and I’m not a fan.)

As interesting as these new developments–called “NoSQL databases”–were, though, only bleeding-edge startups and a tiny handful of other dreamers were really taking them seriously. Databases are beyond mission-critical, after all. If your database is deformed, you’re in real trouble. If your database doesn’t guarantee the integrity of its data and your transactions–i.e. if it doesn’t substantially support what are known as “ACID transactions”–then real database engineers don’t take it seriously:

MongoDB is not ACID compliant. Neither is Cassandra. Neither is Riak. Neither is Redis. Etc etc etc. In fact, it was sometimes claimed that NoSQL databases were fundamentally incompatible with ACID compliance. This isn’t true — Google’s Megastore is basically ACID compliant, and their Spanner is even better — but you can’t use Megastore outside of Google unless you’re willing to build your entire application on their idiosyncratic App Engine platform.
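To make “ACID” a little more concrete, here is a minimal sketch of the atomicity and isolation parts of that guarantee on a toy in-memory key-value store. Everything in it (`TinyKV`, `_Txn`, `transfer`) is a hypothetical illustration, not the API of any database named above, and it ignores durability and distribution entirely.

```python
import threading
from contextlib import contextmanager


class _Txn:
    """Buffers writes so they can be applied together or not at all."""

    def __init__(self, committed):
        self._committed = committed
        self.writes = {}

    def get(self, key, default=None):
        # Read your own uncommitted writes first, then the committed data.
        if key in self.writes:
            return self.writes[key]
        return self._committed.get(key, default)

    def set(self, key, value):
        self.writes[key] = value


class TinyKV:
    """Toy key-value store (hypothetical): multi-key, all-or-nothing updates."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # one global lock serializes transactions

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    @contextmanager
    def transaction(self):
        with self._lock:
            txn = _Txn(self._data)
            yield txn  # an exception raised by the caller skips the commit below
            self._data.update(txn.writes)  # commit: apply every write at once


def transfer(store, src, dst, amount):
    """Move `amount` between two keys; either both keys change or neither does."""
    with store.transaction() as txn:
        txn.set(src, txn.get(src, 0) - amount)
        if txn.get(src) < 0:
            raise ValueError("insufficient funds")
        txn.set(dst, txn.get(dst, 0) + amount)
```

A failed `transfer` leaves both balances exactly as they were, which is the property many NoSQL stores of this era did not offer across multiple keys. The hard part, and the reason the claim in the text matters, is keeping that property when the keys live on different machines; this sketch sidesteps it with a single process and a single lock.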

Which is why I was so intrigued a couple of years ago when I stumbled across a booth at TechCrunch Disrupt whose slogan was “NoSQL, YesACID.” It was hosted by a company named FoundationDB, who have performed the remarkable achievement of building an ACID-compliant key-value datastore while also providing a standard SQL access layer on top of that. Earlier this week they announced the release of FoundationDB 3.0, a remarkable twenty-five times faster than their previous version, thanks to what their co-founder and COO compares to a “heart and lungs transplant” for their engine. This new engine scales up to a whopping 14.4 million writes per second.

That is quite a feat of engineering. To quote their blog post, this isn’t just 14 million writes per second, it’s 14 million “in a fully-ordered, fully-transactional database with 100% multi-key cross-node transactions […] in the public cloud […] Said another way, FoundationDB can do 3.6 million database writes per penny.”
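As a back-of-the-envelope check, the two quoted figures can be combined to see what cluster cost they imply. The sketch below assumes both numbers describe the same cluster running at its peak rate; the resulting hourly cost is an inference from those figures, not a number stated in the text, and `implied_cost` is a hypothetical helper name.

```python
WRITES_PER_SECOND = 14_400_000  # peak write rate quoted above
WRITES_PER_PENNY = 3_600_000    # cost efficiency quoted from the blog post


def implied_cost(writes_per_second, writes_per_penny):
    """Return (cents per second, dollars per hour) implied by the two figures."""
    cents_per_second = writes_per_second / writes_per_penny
    dollars_per_hour = cents_per_second * 3600 / 100
    return cents_per_second, dollars_per_hour


cents_per_second, dollars_per_hour = implied_cost(WRITES_PER_SECOND, WRITES_PER_PENNY)
print(f"{cents_per_second:.0f} cents/second, about ${dollars_per_hour:.0f}/hour")
```

Under that assumption the benchmark cluster works out to roughly four cents per second, or about $144 per hour of public-cloud capacity.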

Impressive stuff. Impressive enough to capture the attention of enterprise database engineers, maybe. And obviously a great fit with the forthcoming Internet of Things, and the enormous amount of data that billions of connected devices will soon be constantly capturing.

But most importantly, this will push their competitors to do even better — which, in turn, will hopefully nudge the enormous numbers of enterprises still in the database Bronze Ages, running off Oracle and DB2, to consider maybe, just maybe, beginning to slowly, cautiously, carefully move into the bold new present day, in which developers are spoiled with simple key-value semantics, the full power of classic SQL queries, and distributed ACID transactions, all at the same time. In the long run that will make life better. In the interim, hats off to all the unsung database engineers out there pushing the collective envelope. You may not realize it, but they’re doing us all a huge service.

Source: techcrunch.com

 

Aside

In the world of NoSQL databases, the products that have dominated the conversation are MongoDB and DataStax Enterprise, a leading distribution of Apache Cassandra. But a couple of headlines this week bring into focus a perhaps less-splashy, though rather tenacious player: Apache HBase, which is included with most major Hadoop distributions.

Mongo challenges
The important stories? The seven-year-old MongoDB named its third CEO, and HBase-focused startup Splice Machine received $3M in new funding. There’s nothing in either of these developments, on their own, or even in combination, that proves HBase is gaining ground on MongoDB. After all, outgoing MongoDB CEO Max Schireson attributes his stepping down to the personal toll of travel between the company’s dual headquarters in Palo Alto and New York, and other demands of the job.

But the occurrence of these two news items in the same week, at the very least, provides food for thought around the NoSQL scene.

MongoDB’s fast growth has seemingly introduced growing pains, not only managerially, but also perhaps technologically. I’m hearing more often from developer and industry friends – anecdotally, to be sure – that Mongo has been letting them down in situations of large scale, be it in cluster size or data ingestion volumes.

False dichotomy?
When the other shoe drops in those conversations, it’s DataStax and Cassandra that are usually presented as the counterpoint. This tends to leave HBase out of the conversation.

But HBase’s momentum is growing, and that has little to do with any growth issues over at MongoDB.  While HBase may not have a corporate champion behind it the way Mongo and Cassandra do, it has a lot going for it:

  • HBase, as part of Hadoop, has incumbent status. Its tables are Hadoop Distributed File System (HDFS) files, which means it can process data from, or output data to, other Hadoop workloads, or it can work on its own.
  • Apache Hive can be used to query data in HBase, providing a SQL interface to the NoSQL database
  • MapR has long been promoting the use of HBase for operational applications. The company’s customized read/write version of HDFS helps there, and a C++ based, HBase-compatible database in the company’s M7 Hadoop distribution is especially designed for operational workloads
  • Continuuity’s Reactor product provides a developer platform designed around the combination of Hadoop and HBase
  • Apache Knox, Hortonworks XA Secure and Zettaset Orchestrator all provide security services around HBase data
  • Microsoft (the company behind leading relational database SQL Server) is now offering cloud-based clusters, specially configured for HBase, as a preview in its Azure HDInsight cloud Hadoop service.  In this implementation HBase works atop Azure blob storage
  • As mentioned above, Splice Machine has successfully raised new funding for its HBase offering which, interestingly, is a relational database. This demonstrates, at least to a point, the versatility of HBase as scale-out database infrastructure, that need not limit its use to NoSQL applications

Enough to go around
The interesting thing about HBase, made especially clear by the Microsoft and Splice Machine developments, is that it’s a NoSQL database that augments other data technologies well. HBase’s success isn’t about zero-sum competition and displacement, and it’s not about any one company’s industry prowess.

HBase’s success looks to be about utility and standards. It’s also about HBase’s versatility to work as a standalone database that is nonetheless compatible with other Hadoop technologies and the growing interest in the “data lake” architecture. Keep an eye out for HBase’s continued momentum.

Source: Gigaom.com

Aside

As the need for new computing infrastructure tools accelerates, thanks to the demands of distributed computing and the proliferation of mobile devices, startups focused on new data management and data processing technologies are raising big rounds to propel the next generation of computing.

Couchbase, which has raised $60 million in new financing from new investors WestSummit and Accel Growth Fund, is the latest of these big database companies to build out its war chest as it looks to expand internationally and continue its research and development activities.

The Mountain View, Calif.-based company joins MongoDB as one of the best capitalized startups working on operational data management.

Unlike Hadoop-based vendors such as Cloudera, which has raised over $1 billion in financing to develop better-performing data processing databases, Couchbase and MongoDB are tackling the problem of data management and recovery, says Couchbase chief executive Bob Wiederhold.

“You have huge numbers of simultaneous users of your applications so you’ve got huge numbers of database operations that a database needs to support,” Wiederhold says. Older database technologies are centralized and ultimately lack the same ability to scale up to meet new demand as the distributed databases like Couchbase, he says.

The new capital from WestSummit and Accel Growth will be used to expand the company’s reach in the “big data” market that now accounts for $16.1 billion in global sales, according to the analyst group IDC.

One area where Couchbase will be focusing is on the rollout of its mobile technologies, which launched in May.

The mobile database that Couchbase is now selling basically enables applications on a mobile device to work even when they’re not connected to the internet.

“Today when you run a mobile app, you’ve been frustrated by a case when you don’t have an internet connection or the internet connection is bad. There’s some caching that takes place, but you can’t use the functionality of the app,” Wiederhold says. “If you could store your data on your mobile device, you could get very fast response times, and then whenever you did have an internet connection you could sync your data to the cloud.”

That’s basically the functionality that the Couchbase mobile product provides. “The move to mobile is an enormous mega-trend and people are going to be developing mobile applications first… We think this mobile database is going to be one of those key technologies in order to build those great applications.”
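The store-locally, sync-later pattern Wiederhold describes can be sketched in a few lines. This is a hypothetical illustration, not Couchbase’s actual mobile API: `OfflineFirstStore` and the `push` callback are invented names, `push` is assumed to raise `ConnectionError` when the device is offline, and the sketch only uploads changes (no download, no conflict resolution).

```python
from collections import deque


class OfflineFirstStore:
    """Toy local store that queues writes and uploads them when it can."""

    def __init__(self, push):
        self._local = {}         # data kept on the device, always readable
        self._pending = deque()  # writes not yet uploaded, oldest first
        self._push = push        # hypothetical callable(key, value) that uploads

    def put(self, key, value):
        # Writes always succeed locally, with or without a connection.
        self._local[key] = value
        self._pending.append((key, value))

    def get(self, key, default=None):
        # Reads never touch the network, hence the fast response times.
        return self._local.get(key, default)

    @property
    def pending_count(self):
        return len(self._pending)

    def sync(self):
        """Upload queued writes in order; stop at the first connection failure.

        Returns the number of writes uploaded during this call.
        """
        pushed = 0
        while self._pending:
            key, value = self._pending[0]
            try:
                self._push(key, value)
            except ConnectionError:
                break  # still offline: keep the remaining writes queued
            self._pending.popleft()
            pushed += 1
        return pushed
```

An app would call `put` and `get` freely and invoke `sync` whenever connectivity returns; anything that failed to upload simply stays in the queue for the next attempt.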

Looking beyond mobile, Couchbase’s chief executive sees additional opportunities for international expansion — one reason for the addition of WestSummit, a Sino-US investment fund. Raymond Yang, a co-founder and managing partner at WestSummit, will take a seat on the Couchbase board.

Couchbase has its roots in two separate database companies, CouchOne and Membase, which merged in 2011.

The company has raised over $100 million to date from investors including Accel Partners, Mayfield Fund, North Bridge Venture Partners, Ignition Partners, and Adams Street Partners.

Source: Techcrunch.com