It sounds pretty safe. Your only copying rows from one database into another database. Unless you accept arbitrary SQL strings from the user it's not a significant security risk.
It does open up some possible vulnerabilities like can the user overwrite other people's information but mitigating that requires the same validation layer you should have anyway.
SQLite is explicitly not safe to be used on arbitrary DB files and there’s a nontrivial amount of exploits on it from DOS to heap overflows to remote code execution that stem from untrusted SQL queries or processing untrusted DB files [1].
At a minimum you have to follow [2] but you don’t get to say “it’s safe to open malicious files or process unrelated queries“ and “SQLite has a good security track record because all our CVEs are only from untrusted queries and malicious input files and CVEs are useless anyway“. Those are facially contradictory positions likely written by different team members that reflect their individual perspective rather then there being a well thought or security stance (at least in my opinionated viewpoint).
As another comment pointed out, it's more dealing with carefully crafted db files that trigger issues or exploits, like a zipbomb would for archive processing.
Its safe enough. The user never sees the database they send, and the import routine does an extract and conversion that only uses a few tables. None of it goes near the user auth infrastructure except to check that the sender is who they claim to be.
A user can trivially sniff the traffic, realize you are sending a zipped SQLITE database, and craft a malicious binary file, zip it, and send it to your API. What the user sees or can do using your app is irrelevant from a security perspective.
You are opening an untrusted binary file using SQLite on your backend. This is 100% not safe.
You should convert to JSON or some other serialization before you send it, then your API should only accept JSON. Zipping a SQLite database is not a good serialization method... Accepting and opening an arbitrary sqlite binary file is asking for trouble.
Those problems could be addressed in a pretty easy way, plus if you're security conscious, just run the importer in an entirely separate container which will basically be single use (possibly distroless, if you want to go that far), with resource limits in place.
But that's not my point. My point is that both of the data formats should be pretty much equal and them not being so in practice is just a shortcoming of the software that's used - for example, even spreadsheets ask you before executing any macros inside of them. There definitely should be a default mode of addressing such files for just reading data, without handing over any control of the computer to them.
> Zipping a SQLite database is not a good serialization method...
Therefore, with this i disagree. SQLite might be flawed, but zipping an entire dataset and sending it over the network, to be parsed and merged into a larger one is an effective and simple solution. Especially, given that your app can use SQLite directly, but probably won't be as easy to make while storing the state as a large JSON file, which will incur the penalty of having to do conversion somewhere along the way. Here's why i think it's a good approach: https://sqlite.org/appfileformat.html
Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary? In summary, i think that software should just be locked down more to accomodate simple workflows.
> My point is that both of the data formats should be pretty much equal and them not being so in practice is just a shortcoming of the software that's used - for example, even spreadsheets ask you before executing any macros inside of them.
Okay, but you need to defend against reality, not against what could in theory be possible.
Sandboxing is a pretty good solution, at least.
> Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary?
Raw SQLite files are a huge attack surface that isn't directly designed to be secure. JSON is an extremely simple format that can be parsed securely by just about anything (though occasionally different parsers will disagree on the output).
(Edit: Strange - this reply was intended to be attached to the grandparent of this post. Not sure why it ended up here.)
XML, a data format explicitly designed for interchange where parsing untrusted input was a design goal of the language.. contains ‘external entities’, which permits the person crafting an XML doc to induce a vulnerable reader of the document to pull in arbitrary additional resources and treat the data from them as if they came from the document creator.
There are all sorts of confused deputy attacks you can perform via this kind of mechanism.
If XML can have that kind of issue, when it ostensibly contains no arbitrary execution instruction mechanism at all, how can you expect a sqllite database file, which can contain VIEW and TRIGGER definitions, to be safe?
It's been repeated a few times here that SQLite is a big attack surface - might be worth taking that discussion to a new submission rather than continuing to hijack this one:
Your argument can be extended to claim just shipping executable binaries that output the data you want when you execute them should be equivalent as well.
It’s unsafe because the attack surface is so large and the use case of an untrusted attacker isn’t something strongly considered.
> In my eyes, the following should be equal:
In an ideal world maybe, but this hasn’t been true for the last 50 years.
These are good points, I do have some security concerns raised above that need addressing but overall, the profile of this app is tiny and the the chain of ownership is relatively secure so the risks seem pretty low to me.
At least for your first point : Security by obfuscation is not security. Counting on « the user not realizing » that your app has a massive security hole is not security.
For the second point, I would say that SQLite has a massive attack surface, it would be very difficult to ensure that that technique can’t lead to an exploit of some form.
On a rooted phone the local database copy could be fiddled with I guess, but the user needs to be authenticated to upload a database, the lambda that extracts the data is sandboxed to access only what it needs and nothing in sqlite is run, the extractor does a select on a few tables.
Unless there is some way to introduce a malicious side effect to a select statement in sqlite?
> On a rooted phone the local database copy could be fiddled with I guess
If you depend on users (attackers) not being able to modify their software or environment and poke around at each and every bit of your (publicly accessible) interfaces you are doing something awfully wrong!
> but the user needs to be authenticated to upload a database
Is registration for your service limited to a fixed amount of trustworthy people? Otherwise this isn't an obstacle.
> the lambda that extracts the data is sandboxed to access only what it needs
Using a simple serialisation format would be orders of magnitudes safer (and simpler)
> Unless there is some way to introduce a malicious side effect to a select statement in sqlite?
It has nothing to do with fiddling with a phone. The user simply needs to run their traffic through a proxy, observe what kind of requests it's making, and then construct a malicious request from a machine that's well suited to doing so - probably their desktop rather than their phone. They can obtain the auth token either from sniffing the traffic or by extracting it from their phone; the former is easier. You seem to be assuming that the only way of maliciously making a request is by somehow altering the phone which is running the app. That is not how it works.
As for introducing a malicious side effect into the query, that's simple: just add an UPDATE, DELETE, CREATE, or INSERT. When you say that the importer can only run SELECT statements, do you mean that it's only authorised to make SELECT statements, or are you simply assuming that the importer won't be able to mutate any data? Because I suspect it's the latter, and that's not correct. I really truly hope your application is not responsible for anything important.
Can the authenticated user upload a modified database file that identities him as a different user? (For example by changing a uuid or username in the data before it is sent)? That could be done in JSON format, too, so it is not specific to SQLite. Just curious if this is a possibility.
Let’s assume your code does something along the lines of:
Download sqllite file from S3
Mount file as a sqllite database
Execute a statement like SELECT * FROM userData on the mounted database
Connect to an online database and insert the returned data into an importedData table for later validation and integration
I’m assuming you’re running this in an ephemeral lambda like context where it only has the data and permissions needed to accomplish those operations.
What can go wrong here, given the user has control over the sqllite file? How could someone who has observed that your system uploads a zipped sqllite file craft a payload to do something malicious?
Well, that code would run just fine even if userData was not a table - it could be a view. That means the data returned to your query doesn’t have to come from data in the sqllite file they uploaded, but could be being calculated at select time inside your process. Are you sure there aren’t any sqllite functions that a view could use that can read environment data or access the file system? If there are, they could get your code to import that data into their account - data that might easily include S3 access secrets or database credentials.
Are you also sure there’s nothing in a sqllite file that tells sqllite ‘load the data from this arbitrary external file and make that available inside the schema of this database’? Then a view could pull data from that context.
Maybe that would let someone craft a sqllite database that imports the contents of your AWS profile file as if it were one of their data values.
Now, I did take a look at the sqllite sql syntax and I will say I don’t see anything that looks immediately exploitable (no ‘readEnvironment()’ built in function or anything) but that doesn’t mean there’s nothing there (are there any undocumented test features built in to specific implementations of sqllite maybe?). But the question you need to consider is: Mounting fully untrusted db files just might not really be a vector sqllite is built to defend against, in which case that puts the onus on you to be sure that the file is as you expected.
Also, where are you left if in the future a new version adds a feature like that?
ANY mechanism along these lines that lets a sqllite db pull in environment or file system data would make this system exploitable within the bounds of sqllite, even if sqllite contained no ‘vulnerabilities’ like buffer overruns to maliciously crafted files.
And the crazy thing is, these kinds of vectors have shown up in data exchange oriented file formats like XML and YAML, so it’s honestly prudent to assume that in a richer format like sqllite they are almost certainly present until proven otherwise.