Is that safe? That sounds like intentional SQL injection

vbitz · on Sept 29, 2021

It sounds pretty safe. Your only copying rows from one database into another database. Unless you accept arbitrary SQL strings from the user it's not a significant security risk.

It does open up some possible vulnerabilities like can the user overwrite other people's information but mitigating that requires the same validation layer you should have anyway.

vlovich123 · on Sept 29, 2021

SQLite is explicitly not safe to be used on arbitrary DB files and there’s a nontrivial amount of exploits on it from DOS to heap overflows to remote code execution that stem from untrusted SQL queries or processing untrusted DB files [1].

At a minimum you have to follow [2] but you don’t get to say “it’s safe to open malicious files or process unrelated queries“ and “SQLite has a good security track record because all our CVEs are only from untrusted queries and malicious input files and CVEs are useless anyway“. Those are facially contradictory positions likely written by different team members that reflect their individual perspective rather then there being a well thought or security stance (at least in my opinionated viewpoint).

[1] https://www.sqlite.org/cves.html

[2] https://www.sqlite.org/security.html

smackeyacky · on Sept 29, 2021

I do have a few of these in my importer but thanks for the links, will be implementing a few more of these safety measures.

smackeyacky · on Sept 29, 2021

Exactly. It won't overwrite stuff that isn't yours, everythin is in a silo in the database.

tehbeard · on Sept 29, 2021

As another comment pointed out, it's more dealing with carefully crafted db files that trigger issues or exploits, like a zipbomb would for archive processing.

nine_k · on Sept 29, 2021

Well, create a zip file with all SQLite tables converted to CSV. This can be done in a streaming fashion, using very little extra space.

Hopefully your CSV parser has fever vulnerabilities.

smackeyacky · on Sept 29, 2021

Its safe enough. The user never sees the database they send, and the import routine does an extract and conversion that only uses a few tables. None of it goes near the user auth infrastructure except to check that the sender is who they claim to be.

8192kjshad09- · on Sept 29, 2021

A user can trivially sniff the traffic, realize you are sending a zipped SQLITE database, and craft a malicious binary file, zip it, and send it to your API. What the user sees or can do using your app is irrelevant from a security perspective.

You are opening an untrusted binary file using SQLite on your backend. This is 100% not safe.

You should convert to JSON or some other serialization before you send it, then your API should only accept JSON. Zipping a SQLite database is not a good serialization method... Accepting and opening an arbitrary sqlite binary file is asking for trouble.

KronisLV · on Sept 29, 2021

In my eyes, the following should be equal:

  - a binary file that's used for storing data, like an SQLite database
  - a text file that's used for storing data, like an XML or JSON file

Someone else linked a list of CVEs and how those could be exploited, the look of which isn't that bad: https://news.ycombinator.com/item?id=28690837

Those problems could be addressed in a pretty easy way, plus if you're security conscious, just run the importer in an entirely separate container which will basically be single use (possibly distroless, if you want to go that far), with resource limits in place.

But that's not my point. My point is that both of the data formats should be pretty much equal and them not being so in practice is just a shortcoming of the software that's used - for example, even spreadsheets ask you before executing any macros inside of them. There definitely should be a default mode of addressing such files for just reading data, without handing over any control of the computer to them.

> Zipping a SQLite database is not a good serialization method...

Therefore, with this i disagree. SQLite might be flawed, but zipping an entire dataset and sending it over the network, to be parsed and merged into a larger one is an effective and simple solution. Especially, given that your app can use SQLite directly, but probably won't be as easy to make while storing the state as a large JSON file, which will incur the penalty of having to do conversion somewhere along the way. Here's why i think it's a good approach: https://sqlite.org/appfileformat.html

Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary? In summary, i think that software should just be locked down more to accomodate simple workflows.

Dylan16807 · on Sept 29, 2021

> My point is that both of the data formats should be pretty much equal and them not being so in practice is just a shortcoming of the software that's used - for example, even spreadsheets ask you before executing any macros inside of them.

Okay, but you need to defend against reality, not against what could in theory be possible.

Sandboxing is a pretty good solution, at least.

> Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary?

Raw SQLite files are a huge attack surface that isn't directly designed to be secure. JSON is an extremely simple format that can be parsed securely by just about anything (though occasionally different parsers will disagree on the output).

jameshart · on Sept 29, 2021

(Edit: Strange - this reply was intended to be attached to the grandparent of this post. Not sure why it ended up here.)

XML, a data format explicitly designed for interchange where parsing untrusted input was a design goal of the language.. contains ‘external entities’, which permits the person crafting an XML doc to induce a vulnerable reader of the document to pull in arbitrary additional resources and treat the data from them as if they came from the document creator.

There are all sorts of confused deputy attacks you can perform via this kind of mechanism.

If XML can have that kind of issue, when it ostensibly contains no arbitrary execution instruction mechanism at all, how can you expect a sqllite database file, which can contain VIEW and TRIGGER definitions, to be safe?

smackeyacky · on Sept 29, 2021

It's been repeated a few times here that SQLite is a big attack surface - might be worth taking that discussion to a new submission rather than continuing to hijack this one:

https://news.ycombinator.com/item?id=28691759

kortilla · on Sept 29, 2021

Your argument can be extended to claim just shipping executable binaries that output the data you want when you execute them should be equivalent as well.

It’s unsafe because the attack surface is so large and the use case of an untrusted attacker isn’t something strongly considered.

> In my eyes, the following should be equal:

In an ideal world maybe, but this hasn’t been true for the last 50 years.

smackeyacky · on Sept 29, 2021

These are good points, I do have some security concerns raised above that need addressing but overall, the profile of this app is tiny and the the chain of ownership is relatively secure so the risks seem pretty low to me.

crubier · on Sept 29, 2021

At least for your first point : Security by obfuscation is not security. Counting on « the user not realizing » that your app has a massive security hole is not security.

For the second point, I would say that SQLite has a massive attack surface, it would be very difficult to ensure that that technique can’t lead to an exploit of some form.

smackeyacky · on Sept 29, 2021

On a rooted phone the local database copy could be fiddled with I guess, but the user needs to be authenticated to upload a database, the lambda that extracts the data is sandboxed to access only what it needs and nothing in sqlite is run, the extractor does a select on a few tables.

Unless there is some way to introduce a malicious side effect to a select statement in sqlite?

thefr0g · on Sept 29, 2021

> On a rooted phone the local database copy could be fiddled with I guess

If you depend on users (attackers) not being able to modify their software or environment and poke around at each and every bit of your (publicly accessible) interfaces you are doing something awfully wrong!

> but the user needs to be authenticated to upload a database

Is registration for your service limited to a fixed amount of trustworthy people? Otherwise this isn't an obstacle.

> the lambda that extracts the data is sandboxed to access only what it needs

Using a simple serialisation format would be orders of magnitudes safer (and simpler)

> Unless there is some way to introduce a malicious side effect to a select statement in sqlite?

See all the links posted here already

samhw · on Sept 29, 2021

It has nothing to do with fiddling with a phone. The user simply needs to run their traffic through a proxy, observe what kind of requests it's making, and then construct a malicious request from a machine that's well suited to doing so - probably their desktop rather than their phone. They can obtain the auth token either from sniffing the traffic or by extracting it from their phone; the former is easier. You seem to be assuming that the only way of maliciously making a request is by somehow altering the phone which is running the app. That is not how it works.

As for introducing a malicious side effect into the query, that's simple: just add an UPDATE, DELETE, CREATE, or INSERT. When you say that the importer can only run SELECT statements, do you mean that it's only authorised to make SELECT statements, or are you simply assuming that the importer won't be able to mutate any data? Because I suspect it's the latter, and that's not correct. I really truly hope your application is not responsible for anything important.

TedDoesntTalk · on Sept 29, 2021

Can the authenticated user upload a modified database file that identities him as a different user? (For example by changing a uuid or username in the data before it is sent)? That could be done in JSON format, too, so it is not specific to SQLite. Just curious if this is a possibility.

jameshart · on Sept 29, 2021

Let’s assume your code does something along the lines of:

   Download sqllite file from S3
   Mount file as a sqllite database
   Execute a statement like SELECT * FROM userData on the mounted database
   Connect to an online database and insert the returned data into an importedData table for later validation and integration

I’m assuming you’re running this in an ephemeral lambda like context where it only has the data and permissions needed to accomplish those operations.

What can go wrong here, given the user has control over the sqllite file? How could someone who has observed that your system uploads a zipped sqllite file craft a payload to do something malicious?

Well, that code would run just fine even if userData was not a table - it could be a view. That means the data returned to your query doesn’t have to come from data in the sqllite file they uploaded, but could be being calculated at select time inside your process. Are you sure there aren’t any sqllite functions that a view could use that can read environment data or access the file system? If there are, they could get your code to import that data into their account - data that might easily include S3 access secrets or database credentials.

Are you also sure there’s nothing in a sqllite file that tells sqllite ‘load the data from this arbitrary external file and make that available inside the schema of this database’? Then a view could pull data from that context.

Maybe that would let someone craft a sqllite database that imports the contents of your AWS profile file as if it were one of their data values.

Now, I did take a look at the sqllite sql syntax and I will say I don’t see anything that looks immediately exploitable (no ‘readEnvironment()’ built in function or anything) but that doesn’t mean there’s nothing there (are there any undocumented test features built in to specific implementations of sqllite maybe?). But the question you need to consider is: Mounting fully untrusted db files just might not really be a vector sqllite is built to defend against, in which case that puts the onus on you to be sure that the file is as you expected.

Also, where are you left if in the future a new version adds a feature like that?

ANY mechanism along these lines that lets a sqllite db pull in environment or file system data would make this system exploitable within the bounds of sqllite, even if sqllite contained no ‘vulnerabilities’ like buffer overruns to maliciously crafted files.

And the crazy thing is, these kinds of vectors have shown up in data exchange oriented file formats like XML and YAML, so it’s honestly prudent to assume that in a richer format like sqllite they are almost certainly present until proven otherwise.