Forum Moderators: open
I would like to have some tips on what database to use for an application that I'm currently working on.
I'm going to have a table where the data is quite temporary and changes a lot (from one day to another the rows may be totally different). The row will have the following fields: one primary id, three varchar fields of about 200 bytes each (they will be of about 50 bytes each on average), two other integers and three more char fields of about 20 bytes each. The 200 bytes varchar fields should be indexed and searchable, and one of the 20 bytes fields (a hash value) should be an index as well for fast access. Because of the temporary nature of the data in the table, it's not critical if data is lost or cleared in case of a crash.
I've estimated that with about 10k clients there will be about 10 million rows in the table (about 2-3 GB) at any given moment. With this traffic, there will be about 7 inserts and deletes each second with an average of 1000 rows inserted and deleted on each insert/delete. In other words 7000 rows will be inserted and deleted every second. At the same time there will be about 150 selects every second. The
This seems to be a quite heavy load, and therefore I've been thinking of using an in-memory database, since the data is not critical. But using an in-memory database raises a problem with the size of the database since I will need a server with a LOT of internal memory (the database can't be larger than the total amount of physical memory). An in-memory database would probably solve all processing performance problems, but the internal memory will set the limit. I'm not sure but I'm guessing that if using, for example MySQL, the limit will be the processing time of the inserts/deletes.
So I guess what I'm hoping for now, is to get input on what kind of database I could use that can handle this amount of traffic. How would MySQL perform with that kind of traffic? Anyone who has experience of some similar application?
Thanks a lot in advance!
/Jonatan
Since the application is not yet released I would prefer not to explain why so many rows need to be inserted/deleted every second, but I can assure you that it IS needed. And for my questions I really don't see any need to explain it either.
Thanks,
/Jonatan
If I should stress-test it myself I need to create a database table that contains data that is similar to the production case, and then I need to write a program/script that can "stress" the database with a lot of queries. I was hoping that I could avoid this, but maybe this is what I need to do.
Thanks again,
/Jonatan
We have decided that we will probably go with a MySQL database that has a large in-memory (heap) table.
Anyone has an idea on what kind of server would be needed for the kind of load I described earlier using a MySQL in-memory (heap) table?
We are currently looking into purchasing a server that has two dual core AMD Opteron 270 (2 ghz, 64-bit) and 4 gb ram. Does anyone know if this configuration would be sufficient?
txbakers:
Any tip on how to "handle the connection objects correctly"?
Thanks for the help so far!
/Jonatan
[edited by: Jonatan at 3:02 pm (utc) on Mar. 25, 2007]
We are currently looking into purchasing a server that has two dual core AMD Opteron 270 (2 ghz, 64-bit) and 4 gb ram. Does anyone know if this configuration would be sufficient?
That might be an overkill althought not knowing exactly what you're doing makes it tough to answer. I can say that on a setup close to what you have listed I can almost always get at least 100k questions per second in memory and sometimes I run as high as 272K per second.
With memory tables the important factors are how much memory, the memory speed and the cpu speed. So you'd get better performance if you used 1066 memory versus 667 or 533.
JAG
For example, I have a small script that only loads about 10 Gig of data and less than 200 million rows at a time. I then process that data in the db and output my final results to a disk based table. It only takes about a minute or so to do what I need to do so if I have a db go down for whatever reason I only have to reload that last batch and move on.
In other words...the loss of data just means about a 2 minute loss in processing time. Nice tradeoff from disk based tables that would take me days to do the same processing.
JAG