So it's only been a few weeks since I started letting users upload MP3s again, and already someone has hacked the uploader script, posted a PHP script, executed the file, and accessed the server file system to create symlinks to other users' files. Eventually I am going to need to put this system on its own VPS so there is nothing to even get to outside of the application. Once I saw this happen (via maldet scans), I shut down uploads until I could come up with a few solutions. In the past I have checked MIME types, but these can also be faked. To truly validate that a file is an MP3, I need to do a much deeper inspection of each uploaded file.
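To show why the declared type cannot be trusted, here is a minimal Python sketch that ignores the filename and MIME type and looks only at the first bytes of the file. The name `sniff_kind` and the signature table are hypothetical, made up for illustration. This is a first filter only: a script can hide behind leading whitespace or a fake header, which is why a deeper check is still needed.

```python
# Hypothetical helper: classify an upload by its leading bytes, ignoring
# the filename and whatever MIME type was declared for it.
SIGNATURES = [
    (b"<?php", "php"),
    (b"PK\x03\x04", "zip"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF8", "gif"),
    (b"ID3", "mp3-candidate"),  # an ID3v2 tag at the start of the file
]


def sniff_kind(head: bytes) -> str:
    """Return a rough guess at the file type from its first bytes."""
    for magic, kind in SIGNATURES:
        if head.startswith(magic):
            return kind
    # An MPEG audio frame starts with 11 set bits (the frame sync).
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3-candidate"
    return "unknown"
```

For example, `sniff_kind(b"<?php system($_GET['c']); ?>")` returns `"php"` even if the file was uploaded as `filename.mp3`. Anything that comes back as `"mp3-candidate"` still has to pass the real analysis.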
Back in the day I used an old Python project called eyeD3 (http://eyed3.nicfit.net) to automate writing ID3 tags into a database row for each upload. While researching ID3 tags in 2016 I found something new: getID3 (http://getid3.sourceforge.net). I am going to use getID3 to determine whether an MP3 has valid ID3 tags and a play time. This will be the definitive answer to “is this uploaded file an MP3 or not”. No playtime, no upload. I will need to make an MP3 “Fixer” app to link to this uploader, as I know that many users will be uploading MP3s that were encoded improperly. I have coded this already, so I just need to dig it up.
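To make "no playtime, no upload" concrete, here is a rough, standard-library-only Python sketch that estimates playtime by walking MPEG audio frame headers. It is not getID3 and does not claim to mirror how getID3 computes playtime. It assumes MPEG-1 Layer III only and no junk bytes between an optional ID3v2 tag and the first frame. All function names are hypothetical.

```python
import struct

# MPEG-1 Layer III lookup tables (index 0 = "free" bitrate, not handled here).
BITRATES_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATES_HZ = [44100, 48000, 32000]
SAMPLES_PER_FRAME = 1152


def skip_id3v2(data: bytes) -> int:
    """Return the offset just past a leading ID3v2 tag, or 0 if none."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The tag size is a 28-bit "synchsafe" integer: 7 bits per byte.
        size = ((data[6] & 0x7F) << 21 | (data[7] & 0x7F) << 14
                | (data[8] & 0x7F) << 7 | (data[9] & 0x7F))
        footer = 10 if data[5] & 0x10 else 0
        return 10 + size + footer
    return 0


def parse_frame_header(data: bytes, pos: int):
    """Return (frame_length, sample_rate) or None if no valid frame at pos."""
    if pos + 4 > len(data):
        return None
    (header,) = struct.unpack(">I", data[pos:pos + 4])
    if header >> 21 != 0x7FF:            # 11-bit frame sync
        return None
    if (header >> 19) & 0x3 != 0x3:      # MPEG version 1 only
        return None
    if (header >> 17) & 0x3 != 0x1:      # Layer III only
        return None
    bitrate_index = (header >> 12) & 0xF
    rate_index = (header >> 10) & 0x3
    if bitrate_index in (0, 15) or rate_index == 3:
        return None
    padding = (header >> 9) & 0x1
    bitrate = BITRATES_KBPS[bitrate_index] * 1000
    sample_rate = SAMPLE_RATES_HZ[rate_index]
    return 144 * bitrate // sample_rate + padding, sample_rate


def estimate_playtime_seconds(data: bytes) -> float:
    """Sum the duration of consecutive valid frames; 0.0 means no audio found."""
    pos = skip_id3v2(data)
    seconds = 0.0
    while True:
        parsed = parse_frame_header(data, pos)
        if parsed is None:
            break
        frame_length, sample_rate = parsed
        seconds += SAMPLES_PER_FRAME / sample_rate
        pos += frame_length
    return seconds
```

A PHP script, image, or zip renamed to `.mp3` has no frame sync where the audio should start, so the estimate stays at `0.0` and the file would be rejected.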
Out of the box, the getID3 folder-browse demo worked great for checking a folder of MP3s. In my test folder, most MP3s had no valid ID3 tag data but were still recognized as valid MP3s.
I ran it a few times just to see what it does before I started adding my own code. The first thing I did was set up a DELETE when the playtime is 0 (zero). This will definitely delete any file that is not an MP3. I saw an image, a zip file, and a PHP script uploaded as filename.mp3. Once I had this tested and working, I integrated getID3 into the upload script, so that a failed upload is not saved, the user gets no URL, etc.
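The delete-on-zero-playtime rule can be sketched like this in Python. This is not the actual uploader code. `accept_upload` and the `get_playtime_seconds` callable are hypothetical placeholders for whatever analyzer is plugged in, and naming the stored file with a random server-side name is my own assumption rather than something the original script is known to do.

```python
import os
import secrets
import shutil


def accept_upload(tmp_path, dest_dir, get_playtime_seconds):
    """Keep the upload only if it has a positive playtime; otherwise delete it.

    Returns the stored path, or None when the file was rejected.
    """
    try:
        playtime = get_playtime_seconds(tmp_path)
    except Exception:
        playtime = 0  # an analyzer failure is treated as "not an MP3"

    if not playtime or playtime <= 0:
        os.remove(tmp_path)  # no playtime, no upload
        return None

    # Assumption: never reuse the client's filename for the stored file.
    stored_path = os.path.join(dest_dir, secrets.token_hex(16) + ".mp3")
    shutil.move(tmp_path, stored_path)
    return stored_path
```

The caller only builds a URL when the return value is not `None`, so a rejected file never gets a public link.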
ID3 tags don’t lie, people. Use them.