/usr/lib/R/site-library/RMySQL/TODO is in r-cran-rmysql 0.8-0-2.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | In no particular order:
* MySQL 4.1 introduces a number of enhancements that we should
incorporate into RMySQL. E.g.,
(a) a boolean type (finally!) -- we'll need to get
rid of the coercion to integers in mysqlWriteTable().
(b) prepared statements, which we could implement in RMySQL
with a modest effort by mimicking the implementation in
ROracle.
(c) multiple sql statments per call (what we call "scripts").
For this we'll need to define in DBI a method like
"nextResult" to move over a collection of result sets
created by a "script".
* Embedding MySQL. Add code to handle the embedded MySQL (this
should be pretty easy). This does bring up a configuration
issue. We may be able to easily define a macro based on which
library we're linking against (libmysqlclient or libmysqld).
* Transactions have been available since 4.0 (and optionally before
then). We need to implement them in RMySQL.
* dbApply. Move the dbApply C code into the fetch C code. Extend
dbApply to work with multiple fields and report which field(s)
has/have changed group.
* Allow users to specify whether to transfer all the data to the
MySQL client or not (say, we could add a cache=T/F argument to
fetch). Currently when we execute a query we leave all
the data in the MySQL server and fetch row by row. This has
performance implications (dbExec returns quickly but we incur many
tcp round trips), and we run the risk of the server dropping some
of the output records. We could instead have the server send the
entire result set to the client MySQL, and have the MySQL library
(not R/Splus) cache the result set for us (this can cause a big
increase in the amount of memory that R/Splus is using, although
R itself is not managing the MySQL result set cache proper).
Some advantages of this approach is that small and medium size
result sets could be sped up quite a bit, and the R/S code that
builds up the data.frame where the output goes need not be built
up dynamically but in one go.
* Also, we should allow the user to specify whether to lock or
not the tables we're querying.
* Need to add a test for libz to the configure script.
* Need to look at asynchronous fetches (this brings the issue of
threads and thread safety).
* Move the dbApply code to it's own file dbApply.c so that we can
conditionally compile RMySQL on Splus (currently we abort because
I haven't abstracted out the callback mechanism).
* Should the default for fetch(rs, n = 0) should be n=-1? The issue
is safety (chunking) vs. convenience for small data sets).
A possibility would be to set it to -1, but define an absolute threshold
"max.records" that would cause to stop the fetching at that value
and issue a warning message.
* Benchmark batch size and exec time...
|