Forum Moderators: open

Message Too Old, No Replies

Did InnoDB cause a crash and loss of tables?

         

csdude55

6:34 pm on May 17, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've been using MySQL for about 10 years. In December I changed over to MariaDB; I'm not sure if it's any better, but reviews claimed it to be faster and the code is identical.

All of my tables are MyISAM, but earlier this year I created one table in InnoDB called "error_log". I set my site to write PHP errors to this table instead of a text file, specifically so that I can avoid duplicates and they'll be easier to find. I chose InnoDB because it has a LOT more inserts than reads.

The table has 4 columns:

errno
errstr
errfile
errline

and there's a Primary index on errno, errfile, and errline.

I got busy with other stuff and haven't looked at it in awhile, so there's a possibility that it grew quite a bit in the last few months.

On Saturday morning, "something" happened that made my server completely unresponsive. It took several hours, but my hosting company was able to "fix" it, and they said that they fixed it by adding this to my.cnf:

innodb_force_recovery=4

I'll paste the entire my.cnf file below. They said that the entire virtual disk image had been corrupted, and I think they fixed the corruption using e2fsck.

Everything seemed fine for about 4 hours, and then without warning I had 82 tables just disappear! No explanation, they were just totally gone. But I noticed that the tables that disappeared appeared after "error_log" alphabetically.

That feels like a pretty big coincidence to me... somehow an InnoDB setting was causing my whole server to stop responding, and then all of the tables after the only InnoDB table disappeared?

I should also mention that, at this point, "error_log" only had about 5,000 rows. But it COULD have had more before the crash and deleted rows, I have no idea.

I was able to (partially) restore data (from a backup in December), and then found that "error_log" (the InnoDB table) couldn't be modified; it kept throwing an error that it was read-only. I ended up manually deleting the table via SSH, then recreating a new one that was MyISAM.

What do you think, is InnoDB the source of my problems? I was planning to use it on 9 tables that are mostly just logs (high inserts, low reads), but now I'm very hesitant :-O

csdude55

6:36 pm on May 17, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sorry, I got a Forbidden error when trying to submit with my my.cnf file. I'll try to post it here.


[mysqld]
performance-schema=0
innodb_force_recovery=4

# innodb_buffer_pool_size = 6G
innodb_file_per_table=1

join_buffer_size = 2M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

symbolic-links=0

# pid-file=/var/run/mysqld/mysqld.pid
max_allowed_packet=268435456

low_priority_updates=1

concurrent_insert=ALWAYS

## Table Cache
table_open_cache=2k
table_definition_cache=8k
open_files_limit=10000

## Query Cache
query_cache_size=128M
query_cache_limit=2M

## Key Buffer
key_buffer_size=2G

sql-mode=''

## Errors
slow_query_log=1
#log-queries-not-using-indexes
long_query_time=1 # 1 second, was 5
slow_query_log_file=/var/log/mysqld.slow.log

log-error=/var/log/mysqld.log

lammert

7:33 pm on May 17, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



innodb_force_recovery tells the server to start and ignore almost all errors that appear in the database. The normal step after this is to dump the contents of the database sorted by their primary key, move the original database files to another location and recreate the InnoDB tables from scratch. The MariaDB knowledgebase [mariadb.com] has some hints how to proceed in this situation.

A recovery value of 4 or higher will stop MariaDB from doing any updates to the InnoDB data. So practically these tables are now read-only and can only be dumped and rebuilt. The damage to the database is probably so severe that even a single update prevents the database server from starting.

Reading your story, I think the corruption started at the file system level, corrupting parts in the InnoDB database, probably an index. Then MariaDB panicked, stopped and refused to start again. Corruption in a database file normally doesn't corrupt the file system.

csdude55

9:03 pm on May 17, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Looking through the log file /var/log/mysqld.log, I'm finding some interesting data.

First, I occasionally see groups of 30-40 of these in a row:

2021-05-15 3:43:10 5368803 [Warning] Sort aborted, host: localhost, user: [MY_USERNAME], thread: 5368803, query: SELECT username, image, title, city, gender, age, wanting_gender, wanting_age_min, wanting_age_max, description FROM personals WHERE gender = 'Man' AND wanting_gender = 'Man' AND age >= 18/**//**/OR/**/(SELECT/**/2*(IF((SELECT/**/*/**/FROM/**/(SELECT/**/CONCAT(0x6b767261,(SELECT/**/(ELT(2836=2836,1))),0x7443576e))s),/**/8446744073709551610,/**/8446744073709551610))) AND age <= 64 ORDER BY updated DESC


I WOULD have a legit query of:

SELECT username, image, title, city, gender, age, wanting_gender, wanting_age_min, wanting_age_max, description FROM personals WHERE gender = 'Man' AND wanting_gender = 'Man' AND age >= 18 AND age <= 64 ORDER BY updated DESC


Is this an indication that a hack is successfully modifying the query? The last grouping I see of that query happened about 6 hours before the crash, so I'm not sure if it's related.

The initial crash happened at around 9:30am, and I don't see anything relevant in the logs at that time. After the host was able to get the server back online, though, I have a ton of these (literally thousands):

2021-05-15 17:09:16 0 [Warning] InnoDB: Ignoring a doublewrite copy of page [page id: space=0, page number=737] with future log sequence number 690626617420
2021-05-15 17:09:16 0 [ERROR] InnoDB: Page [page id: space=0, page number=7] log sequence number 690626577502 is in the future! Current system log sequence number 690626034159.
2021-05-15 17:09:16 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.


That backs up your theory that the database corruption came after the file system problem, and not the other way around.

Based on that, my take away is that I should remove innodb_force_recovery from my.cnf, then it's safe to use InnoDB again?

Would you suggest any other my.cnf settings that I should use to prevent a crash like this in the future?

lammert

7:56 am on May 18, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What you see in the file is probably not the result of a hacker but a messed up database. Removing the innodb_force_recovery will probably prohibit the server from starting again. I would try to instead dump the content of the database to a text file and rebuild the tables from scratch. If it is just your error_log table which is in the database, you might even opt to just create a new empty table with the same fields and indexes. You'll probably survive without the error history of the previous months.

csdude55

5:18 pm on May 18, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would try to instead dump the content of the database to a text file and rebuild the tables from scratch.


Hol' up, maybe I've screwed something up, then.

First, I copied all of the files in /var/lib/mysql/example/ to a backup directory so that I would have a safety net. Then I ran a backup through cPanel which created 3 files:

example.create
example.sql
openfileslimit

Then I totally deleted the database "example", then recreated it with the same name and restored the database via cPanel. At that point I found that at least 2 tables could be read but not inserted, so I ran REPAIR on all of the MyISAM tables. That's when I found that "error_log" was showing as read-only, so I went to /var/lib/mysql/example and manually deleted it, then recreated the table as MyISAM.

Since then, I haven't had any errors in /var/log/mysqld.log, so I thought we were all good. But I don't currently have any InnoDB tables. I was thinking about converted several tables that are just logs to InnoDB, but haven't done it yet.

Knowing that, do you still think that removing innodb_force_recovery would crash the server again?

JorgeV

9:30 pm on May 18, 2021 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Hello,

I chose InnoDB because it has a LOT more inserts than reads.

Yes, but, these are inserts at the end of the table, without key constraints, so I think that MyISAM tables are still faster for this kind of insertions.

lammert

10:07 pm on May 18, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you don't have any InnoDB tables, you can safely remove the innodb_force_recovery setting.

csdude55

12:31 am on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@JorgeV, I thought that it was "better" to use InnoDB when there are a lot more inserts than reads. If not, when would be the appropriate time?

@lammert, I was originally hoping to change several tables to InnoDB, but if it's not faster / better then I'm not sure anymore :-/

csdude55

12:34 am on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Also, I just discovered that my hosting clients using Wordpress are (shocker!) all using InnoDB, and all of their tables are now read-only :-/ I have no idea if any data has been lost like it was on my site, though.

So now we're talking hundreds of tables. I can't realistically rebuild each of them as MyISAM manually, that would take forever!

Any suggestions on how to proceed?

lammert

11:13 am on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



First step to proceed is to inform your clients. They will find out sooner or later that their site is read-only or maybe even corrupted. Better to be pro-active on that.

Second step is assessing the real damage to the databases. The last few days were filled with crash recovery, table rebuilding but I miss an assessment of the extent of the damage. Most problems might have been solved, or there might be some real problems still in the dark corners of the InnoDB tables. The only way to know is let a real database administrator look at it and let them fix what is fixable, and delete what is beyond repair.

Third step, consider your future options. You will not be the first company that ceases operation after a central database crash without recent backup.

csdude55

5:55 pm on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



First step to proceed is to inform your clients. They will find out sooner or later that their site is read-only or maybe even corrupted. Better to be pro-active on that.

Yup, did that this morning.

Second step is assessing the real damage to the databases. The last few days were filled with crash recovery, table rebuilding but I miss an assessment of the extent of the damage. Most problems might have been solved, or there might be some real problems still in the dark corners of the InnoDB tables. The only way to know is let a real database administrator look at it and let them fix what is fixable, and delete what is beyond repair.

Well, that's asking for a few different miracles :-( My server provider is unable (or unwilling, it's not clear) to help, and I can barely afford to pay the bills, much less hire someone. Assuming I could even find a qualified person, which is a whole 'nother problem.

So like it or not, I'm the closest thing to a database administrator that's gonna be able to look at it.

Third step, consider your future options. You will not be the first company that ceases operation after a central database crash without recent backup.

Not really an option... I've come close a few times in the last 4 years after ad revenue dropped so hard, but the simple fact is that there are no jobs available for someone like me that pays what I need to cover my personal bills. Believe me, I've looked! So "ceasing operations" for me translates to losing the house that I've been paying on for 20 years, etc. This isn't a hobby for me and there are no safety nets over here; I work or I don't eat...

One way or the other, my options are to either fix the problem and recover, or lose everything and start begging on the street corners.


For the sake of future readers, I've discovered a LOT of websites that have had the same problem with InnoDB. Example:

[helionet.org...]

I wish I'd read that about 6 months ago, and I would have never even tried to use InnoDB. I never noticed any speed improvements or anything, so the risk just isn't worth it.

lammert

6:48 pm on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For every problem you encounter, there will be tens, hundreds maybe thousands of people on the Internet who experienced the same. But reading those stories doesn't help you with the current situation. Better to get your act together, fix the server or otherwise others will decide about step three.

Once everything is running again, make a plan how to prevent this from happening ever again in the future. Don't see it as some bad external thing which happened to you. When handled right it may be a good learning experience after all. If you don't have the money to invest in external help, it might cause some sleep less nights.

Just know that MyISAM and InnoDB are two entirely different beasts. The first one is a simple datafile/indexfile based storage system where a crash will either corrupt some data or the index. Create a new index and you are good to go in most cases. It has table based locking which is efficient for systems with a few nodes doing inserts and updates.

InnoDB on the other hand is an ACID compliant storage engine supporting transactions, rollbacks, foreign keys, RAM buffering, error recovery and multi-master support. Row based locking requires much more sophisticated locking control software, but it will allow more nodes to do simultaneous inserts and updates without performance degrades. Foreign key constraints can trigger an avalanche of database actions with just one simple query. But with this complex architecture, if a database file gets corrupted, the damage is often much bigger than with MyISAM.

If you plan to stay with InnoDB, be sure you use the option innodb_file_per_table=ON in your settings file. This will split all your InnoDB tables over different files. If one file get's corrupted, you only lose one table.

For quick disaster recovery a backup is vital. You mentioned that downloading it to your local system is not an option due to bandwidth problems, but cloud storage might be an option. Google, Microsoft or Amazon could be an option, or one of the more specialized cloud backup providers. Try to stay away from backup solutions offered by your hosting company. If they have a technical or operational problem, you might have problems to access both your site and your backups. A few automation scripts should do the trick. Dump the database, pack it together with the necessary files in a .tar or .zip file and transfer it to the remote location each night. The peace of mind you get is well worth the few dollars such a solution will cost.

csdude55

7:56 pm on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm kind of straying a little, but I had a thought process of setting up a small shared server somewhere, then modify all of my scripts to insert to both the main database AND a second one on that shared server. So not a dump, per se, just doubling down on the inserts.

Then, no matter what, I would have a copy of the database up to the last successful insert instead of the last backup.

I could still run backups for the hosted sites, of course, but none of them update more than once a week (or month, really) so they don't need the same up-to-the-minute backup like I could use.

lammert

8:57 pm on May 19, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A backup can be used for all disasters. Mirroring/high availability only works as long as the underlying system is not corrupted. Doing double database updates in all scripts will also slow down the sites because the slowest database connection dictates the execution speed. And your clients' WordPress installations will probably not be able to do it, unless you hack around in the internals of WordPress.

Therefore, if you only have the resources to go for one, go for backups.

Backups once a month is not an option, unless you enjoy stressful periods as you are going through now. Once a week may be sufficient for semi static sites but once a day is almost mandatory for sites with often changing or user generated content.