Understanding What Database To Use
There are many databases, and each one tends to be good at a particular kind of job, so you usually have to find the best fit for yours.
A Quick Note:
I am not a database expert with years of experience in the field, but I have tried numerous databases, and whenever I thought about the overall design of the schema, performance, and scalability, I overcomplicated things. Why? Because there are so many databases to choose from, when really you just need to find the one that serves your purpose.
So, if you are an expert, you can correct me, but if you are new to databases and handling data, then let's have fun!
What is a Database:
In my own words, a database is something that stores information to be used later. You can put almost anything you want into a database; the sky is the limit, really. A special characteristic of a database, though, is that you can secure the data and put an API (application programming interface) in front of it. An interface here is much like an interface in a programming language: an agreed way to ask for something without knowing how it works inside. Basically, you call a function, which in this case is a request to a URL over HTTP or similar, to access the data. Think of the old days, when you wanted to call someone and there was a middle-man, which is like the middleware in your software or application. The middle-man looks up the person you want and connects you, much like fetching the data and showing it to you. BUT WAIT, what if the middle-man requires a super secret password or more? Well, you can do that too, depending on your server and back-end. Isn't that great?
Basically, a database stores information that will be used later.
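To make the middle-man idea a bit more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table, the function names, and the password are all made up for illustration, and a real back-end would sit behind HTTP and check credentials properly instead of comparing a hard-coded string.

```python
import sqlite3

# Hypothetical secret, for illustration only.
SECRET_PASSWORD = "open-sesame"


def make_database():
    """Create a small in-memory database with one table of contacts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (name TEXT PRIMARY KEY, phone TEXT)")
    conn.execute("INSERT INTO contacts VALUES (?, ?)", ("alice", "555-0100"))
    conn.commit()
    return conn


def lookup_phone(conn, name, password):
    """The 'middle-man': check the password, then fetch the data."""
    if password != SECRET_PASSWORD:
        raise PermissionError("wrong password")
    row = conn.execute(
        "SELECT phone FROM contacts WHERE name = ?", (name,)
    ).fetchone()
    # Return None when nobody by that name is stored.
    return row[0] if row else None
```

The caller never touches the table directly. It only calls lookup_phone, which is the interface, and the function decides whether the request is allowed before it goes to the database.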
Why Do So Many Databases Exist:
Really, I think the main reason is that the developers or community behind a database go stale, or the community sucks, and the project is just left there to rot, so people build something new. However, there are only a few types of database, by which I mean ways of organizing data. There are mainly two types, and each one's disadvantages tend to be the other's advantages.
These two main types are SQL and NoSQL, but now we also have NewSQL. There is also GraphQL, but that's a query language for APIs that sits on top of your existing database, kind of like a middle-man, and it lets clients ask for exactly the data they need rather than replacing the queries you run on the database itself.
The Most Used Databases:
From what I can tell, the most used database is probably MySQL. Why? Because many of the most popular CMS packages out there, like Joomla, WordPress, Drupal, and Magento, use MySQL or a SQL equivalent. Why is that? Because of PHP: back then it was the language with by far the most information available for web development, and a lot of that legacy code and software simply evolved into what we use today. However, I think people will start switching, because PHP has some limitations: its object-oriented programming (OOP) support feels weaker, it is not as modular as other languages, and the syntax is awkward (not a limitation, though). How does that link to MySQL? There is a plethora of tutorials on configuring MySQL with PHP, so just about everyone who wanted to make something cool back then was using PHP and MySQL.
(Chart: the trend of PHP and MySQL.)
However, I feel that developers and engineers will start using better alternatives as they discover new ways of organizing data, such as graph databases (which, if I'm not mistaken, are different from GraphQL).
What Are Some Popular Databases:
Now, some of the hottest databases you may have heard of, as of 2017 and 2018, are Firestore, Azure Cosmos DB, IBM Db2, and Datomic. I'm sure there are more, but those are some good ones that are quite different from each other.
Personally, I have tried only one of them (Firestore), and I didn't make any big applications with it. You can look each of them up and see what it specializes in.
Firestore is much like the Realtime Database in Firebase; however, it has better queries and, I believe, some extra features for organizing data into collections. Azure Cosmos DB basically combines several kinds of database logic, SQL-style and NoSQL-style; what makes it different is the service Microsoft provides, which makes it easier to replicate and scale across the globe, because Microsoft has lots of data centers. IBM Db2 is a similar idea as a hosted service, though at its core it is a relational database, and you can use machine learning on your databases (probably the same with Microsoft). Datomic is sold through the AWS Marketplace but has its own cloud database service.
You could argue that hosted services like these strip away your freedom and basically have control of your data. Many people would rather be in control of their own system and database, without the prying eyes of other people.
What Database For The Job:
Really, you should learn the databases that are the best of their respective types, because you don't know which community or company will end up with superior development and support. Look at what happened to the free, open-source MySQL: its development seems to be lagging compared to other databases like PostgreSQL. Then we have NoSQL, where I believe MongoDB and CouchDB are some of the top ones to learn.
There are other articles on the web that go into greater detail on the advantages and disadvantages of SQL vs. NoSQL, and at a glance the difference is not hard to see. SQL relies on an RDBMS, the old but masterful relational database management system, and when your records share the same structure it's hard to think of a better way to organize them. However, if you don't care much about that shared structure, you can just store different keys in each document of a collection, as NoSQL does.
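Here is a rough Python sketch of that difference. The relational half uses the built-in sqlite3 module; the document half uses plain dictionaries as a stand-in for a NoSQL collection, so it is not how MongoDB or CouchDB actually store data. The table, names, and values are invented for illustration.

```python
import sqlite3

# Relational style: every row in the table has the same columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    ("alice", "alice@example.com"),
)
rows = conn.execute("SELECT name, email FROM users").fetchall()

# Document style: a collection is a list of documents, and each
# document is free to carry its own set of keys.
collection = [
    {"name": "alice", "email": "alice@example.com"},
    {"name": "bob", "twitter": "@bob", "tags": ["admin"]},
]

# Because keys may be missing, queries have to check for them.
with_email = [doc["name"] for doc in collection if "email" in doc]
```

The trade-off shows up in the last line: the table guarantees that every row has an email column, while the collection gives you freedom at the cost of checking which keys each document has.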
One important thing I know is that you can tailor your database design to either SQL or NoSQL. The designs will be different, but if you do need relationships, you will probably choose SQL, as it is much easier to fetch all of a user's related data in one query, whereas in NoSQL you would fetch from different collections for the same user, which is more work (I think). In addition, SQL databases tend to scale vertically (a bigger machine), while NoSQL databases tend to scale horizontally (more machines). The reason NoSQL is awesome is that, since it doesn't rely on relationships, the data can be sharded much more easily than in something like MySQL. Sharding is distributing your database across multiple machines, and I believe you can extend your database much more easily with NoSQL, since queries don't rely on relationships either.
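As a toy illustration of sharding, the sketch below routes each document to one of several "machines" based on a hash of its key. It is only a sketch under simple assumptions: the shards are plain Python dictionaries, the number of shards is fixed, and the function names are made up. Real databases use more careful schemes so that adding a machine doesn't reshuffle everything.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical number of machines


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a stable hash of the key."""
    # hashlib is used instead of the built-in hash(), which Python
    # randomizes for strings between runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each dict stands in for a database living on a separate machine.
shards = [{} for _ in range(NUM_SHARDS)]


def put(key: str, document: dict) -> None:
    """Store the document on whichever shard its key maps to."""
    shards[shard_for(key)][key] = document


def get(key: str):
    """Look only at the one shard that could hold this key."""
    return shards[shard_for(key)].get(key)
```

This works nicely because each document can be found from its key alone. A query that needs to combine records living on different shards is where relationships start to hurt.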
Here is a good explanation of disadvantages for not sharding a traditional SQL database:
In theory, you would also most likely want to use NoSQL for things such as data collection, where you just want to analyze different collections of stored data that do not relate to each other. Examples are the temperature, whether someone entered a room, the distance to something, or the timing of something. None of those require a relationship, and you can collect and analyze them because there will be so many instances of each event. Whereas, in a SQL database, having someone register and post a comment would, in reality, produce less data to collect (it depends on how awesome your application is). So generally, if you collect data that is abundant in nature, go for NoSQL, and if you rely on data that users enter, use a SQL database.
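A small Python sketch of that kind of data collection might look like the following. The readings and field names are invented, and the list of dictionaries again only stands in for a real NoSQL collection.

```python
from statistics import mean

# Hypothetical readings; each document stands alone, with no links
# to any other document.
readings = [
    {"sensor": "temperature", "value": 21.5},
    {"sensor": "temperature", "value": 22.5},
    {"sensor": "door", "opened": True},
    {"sensor": "distance", "value": 3.0, "unit": "m"},
]


def average_value(docs, sensor):
    """Average the 'value' field over all readings from one kind of sensor."""
    values = [
        d["value"] for d in docs if d["sensor"] == sensor and "value" in d
    ]
    # Some sensors (like the door) have no numeric value at all.
    return mean(values) if values else None
```

Nothing here needs a join: each reading is analyzed on its own, which is why this kind of workload fits the document model so comfortably.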
Other Ways of Searching and Handling Data:
Yes, there are so many, but a few popular ones are Hadoop, Elasticsearch, Ceph, and Disco.
I don't have much knowledge on these, but I will look into them more after I learn a few more things on my list.
I will definitely try and write specific articles on just one database next time, but I really wanted to explore and give you awareness of other databases out there. Thanks and have a fantastic day! :)