Rethinking the ODBC Dialect: Inline Parameters

The code the ODBC extension is based on was cloning Rebol2's model of thinking of the database services as being a PORT!. You would run queries by INSERT-ing SQL into the port, and then COPY the port to get a BLOCK! of results back. For anyone unfamiliar:

When you passed a BLOCK! as the thing to INSERT to the port, that block would have a string in the first position...this was the SQL. But if that SQL had ? characters in it (that weren't part of a string literal), then those represented substitution slots. The block then was supposed to have as many additional values in it as ?... and these Rebol values would be filled in as parameters.

This approach is fundamental to how ODBC works (the ? syntax is an ODBC standard). It means that you don't have to turn every value you have into a string to insert it. It saves you from worrying about how to escape those string values, may be more efficient (for some types), and protects against SQL Injection Attacks.

We Can Do Better, and I've Made a Start At It

The idea I had was:

  1. Perform implicit evaluation by default
  2. Assume string components should be implicitly SPACED
  3. Use the new @xxx types to let you put the substitution variables in the spots you want

The first two points will be familiar to anyone who has used PRINT. But the third point means substitutions would be easier to see at the point they are intended to be:

odbc-execute statement [{foo} @value1 {bar} @(expression two) {baz}]
; ...acts like...
insert statement compose [{foo ? bar ? baz} (value1) (expression two)]

The nice things the @ values bring into the mix is that since they do not reduce, you can distinguish them from basic code you're using to try and form the SQL expression from sanitized variables...while still using both.

Behold, More State-of-the-Art

Here's a little wrapper called ODBC-EXECUTE:

odbc-execute: function [
    statement [port!]
    query [text! block!]
    /parameters [block!] "Usually implicitly derived from @params in query"
    parameters: default [copy []]
    if block? query [
        query: spaced map-each item reduce query [  ; see note (1)
            switch type of item [
                sym-word! sym-path! [get item]
                sym-group! [reduce as group! item]
            ] then value => [  ; see note (2)
                append parameters value
            ] else [
                quote :item  ; see note (3)
    insert statement compose [(query) ((parameters))]  ; see note (4)

Pretty brilliant, IMNSHO. :stuck_out_tongue_winking_eye:

  1. We want to REDUCE first, before we go hunting for @params and assuming they are parameters, since they might be arguments to functions. Think of it a bit like how print ["a" first "bc" "d"] can't a-priori know that "bc" is a string to be printed. It has to evaluate, and "bc" disappears as a function the same way, an @xxx might be consumed and not be intended as a parameter. (This may or may not be desirable...but again it's up to the dialect designer, who has freedom to decide how liberal they want to be.)

  2. This is a great example of how to use the value vs. null distinction to avoid repeating code. The branches don't have to repeat the append parameters or the fact that they want ? to wind up in the query string, thanks to the protocol. I'm fibbing a little here, as right now you have to say then (value => [...]) and not just then value => [...] -- but I'm pretty sure that is fixable, it's a rightward-quoting vs. leftward-quoting case where the code needs to let the => on the right win over the then on the left.

  3. This shows generalized quoting being used to get out of a potential jam. The problem is, we REDUCE'd our dialect items before we went looking through them for our @params (I explained why in note 1). But then we want to use SPACED to get spacing, and it reduces by default. There should probably be a way to use DELIMIT and its variants without reducing, but no such entry point exists at the to avoid double-reduction we just quote everything so the evaluation is a no-op.

  4. Just wanted to say I've been really a big fan of the "don't splice by default" for blocks in COMPOSE. Once you get used to the idea that values will be put in as-is unless you use (( )) it really is superior.

Not only do you see several Ren-C-specific tech points making this work, you can also specialize it to make it nicer. I picked a clunky name precisely because I don't know how the PORT! interface should work, but just do this:

sql: specialize 'odbc-execute [statement: statement]

Someday soon that should be able to be sql: (=> odbc-execute statement) as point-free specialization. But maybe it should be sql/execute and that's clearer. Who knows.

Questions, Comments, Concerns?

As a best practices question, one might wonder if it shouldn't be easier to splice unsanitized variables into the SQL than to splice sanitized ones. You might flip it so that WORD! means parameter, and SYM-WORD! means unsafe use...or get even more strict and suggest anyone wanting to put in string components has to do it with an explicit COMPOSE before ODBC-EXECUTE is called.

I feel that's a bit overkill and would frustrate those doing casual hacking on their databases. But the great thing is that if we just build this as a tool on top of a fundamental ODBC operation then people with different opinions can be more aggressive in their versions of the dialect. It's not a big deal if the dialect is easy to make. ミ★

1 Like

Huge thumbs up. I personally did not like the ? ? ? parameter juggling throughout that dialect-- it made code difficult to read. In the 90's I liked ColdFusion's quick/legible approach, which isn't too far I think from the improvement here. (For the youngsters, ColdFusion is/was a templating system using a markup language, i.e., a dialect). At the time CF was criticized for not being a real programming language, but their approach was very easy to understand compared to CGI and quite acceptable performance-wise. It was commercially successful. Some examples: