Parser.

The VTK library is huge: it contains about 700 classes, each with roughly 10 to 100 methods. Writing a wrapper by hand is practically impossible, so a good parser is essential. I use a two-stage parser. In the first stage I apply the gccxml translator provided by Kitware. This is an extension to the gcc compiler which "compiles" code into XML files. These files can then be parsed by any XML parser to generate code suitable for incorporation into R. I use R itself for this purpose, with the XML package (thanks to D. T. Lang). This is not an optimal choice, because the parser is quite slow: processing the whole VTK code base takes about 3 hours on my Intel2400.

gccxml.

Installation of gccxml is described by Kitware. To use it you need the VTK source, and you should make sure that VTK compiles and installs on the system where you want to use gccxml. After the installation of VTK is complete, I use a Tcl script which lists the contents of the current working directory, matches the file names against a regular expression of the form (.*)Python.cxx, and then compiles the corresponding original file \1.cxx. In this way only classes which are already wrapped for Python are converted. The script has to be run in four directories of the VTK source tree: Common/, Rendering/, Filtering/ and Graphics/. Five other sublibraries of VTK (IO/, Imaging/, MPI/, Hybrid/ and Patented/) are not yet included in RVTK.
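The file selection step can be sketched as follows. This is not the Tcl script itself, only a minimal C++ illustration of the same matching rule. The function name sourcesWrappedForPython is hypothetical, and the list of file names is assumed to come from a directory listing made elsewhere.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Given the file names found in one VTK source directory, return the
// original .cxx sources whose classes already have a Python wrapper,
// i.e. for every "<name>Python.cxx" return "<name>.cxx".
std::vector<std::string> sourcesWrappedForPython(
    const std::vector<std::string>& fileNames) {
  // Same idea as the (.*)Python.cxx pattern, with the dot escaped.
  static const std::regex wrapperPattern(R"((.*)Python\.cxx)");
  std::vector<std::string> sources;
  for (const std::string& name : fileNames) {
    std::smatch match;
    // regex_match requires the whole file name to fit the pattern.
    if (std::regex_match(name, match, wrapperPattern)) {
      sources.push_back(match[1].str() + ".cxx");
    }
  }
  return sources;
}
```

Each returned name would then be handed to gccxml. The file names themselves are only examples.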

Classes and methods.

After the VTK code has been processed with gccxml we have a large number of XML files, each about 1 MB in size. Each XML file is named after one of the VTK classes and in most cases contains the description of just that class. Our aim now is to extract information about the class methods from these files.
This is done by an R script which parses an XML file and writes the corresponding code, which is later packed into the R library. One kind of parser output is a file with the .cc extension and the same name as the VTK source file. Such a file contains a number of functions named R_<ClassName>_<MethodName>, each wrapping the call to <MethodName> of <ClassName>. If the class has a multicast (overloaded) method, i.e. several functions with the same name but a different number of arguments, then each method instance is handled separately: the RVTK wrapper gets one function per instance, named R_<ClassName>_<MulticastMethodName> with the suffix "_vx", where x is the instance number. These functions must be callable through R's .Call(...) interface, which means that they must return a SEXP value and all their input parameters must be of type SEXP. Each wrapper takes one argument more than the original C++ method. The extra parameter (always named obj) is a reference to the C++ class instance wrapped in an R external pointer. The task of the wrapping function is to convert the SEXPs to base C types or to pointers to VTK objects, to call the underlying C++ method, and to return a SEXP, even when the VTK method does not return a value.
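The naming scheme can be illustrated with a short sketch. It is not the R parser, only a C++ rendering of the rule described above. The function wrapperNames is hypothetical, and I assume here that the instance counter x starts at 1, which the text does not state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Build wrapper names for the methods of one class. methodNames lists
// every method instance in declaration order, so an overloaded method
// appears more than once.
std::vector<std::string> wrapperNames(
    const std::string& className,
    const std::vector<std::string>& methodNames) {
  // First pass: count how many instances each method name has.
  std::map<std::string, int> instanceCount;
  for (const std::string& method : methodNames) {
    ++instanceCount[method];
  }
  // Second pass: emit R_<ClassName>_<MethodName>, adding "_v<x>" only
  // for names that occur more than once.
  std::map<std::string, int> seen;
  std::vector<std::string> names;
  for (const std::string& method : methodNames) {
    std::string name = "R_" + className + "_" + method;
    if (instanceCount[method] > 1) {
      // Assumption: the instance counter starts at 1.
      name += "_v" + std::to_string(++seen[method]);
    }
    names.push_back(name);
  }
  return names;
}
```

For a class with one GetProperty and two SetScale instances this gives one plain name and two suffixed ones. The method names are chosen for illustration only.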
The XML file contains all the information needed for this: the method name, the number of arguments and their types (or class names), and the way the arguments are passed to the method (by value, as pointers or weak pointers, as constant values). If an argument is a reference to a function things may get more complicated, but luckily VTK seldom uses functions as arguments to methods. In almost all of these cases the argument is a function of the form void * func (void *), which the parser treats separately. For conversions of basic types RVTK provides a number of functions of the form SEXP <c-type>ToSexp(<c-type> *, int dim) and <c-type> * SexpTo<c-type>(SEXP). For conversion of VTK classes (or any C++ class), void * SexpToVTK(SEXP) and VTKToSexp(void * arg, char * typename) are used.
"typename" in the VTKToSexp function is used to set the R class of the returned value (which is in fact an R external pointer). The return value of SexpTo<c-type> may be an array. The size of the array is taken from the length of the corresponding input SEXP argument. The inverse conversion <c-type>ToSexp(...) must be given the length of the input pointer explicitly.

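The asymmetry between the two conversion directions described in this section can be shown with stand-in types. The sketch below does not use R's API: MockSexp is a hypothetical substitute for a numeric SEXP, and mockSexpToDouble, mockDoubleToSexp, scaleInPlace and mockWrapScale are invented names which only mimic the roles of SexpTo<c-type>, <c-type>ToSexp, a wrapped method and its wrapper.

```cpp
#include <cassert>
#include <vector>

// Stand-in for an R numeric vector. This is NOT R's SEXP type; it only
// models "a vector that knows its own length".
struct MockSexp {
  std::vector<double> values;
};

// Stand-in for SexpTo<c-type>: the array length is taken from the input.
std::vector<double> mockSexpToDouble(const MockSexp& s) {
  return s.values;
}

// Stand-in for <c-type>ToSexp: a raw pointer carries no length, so the
// caller has to pass dim explicitly.
MockSexp mockDoubleToSexp(const double* data, int dim) {
  MockSexp s;
  s.values.assign(data, data + dim);
  return s;
}

// Hypothetical wrapped "method": doubles every element in place.
void scaleInPlace(double* data, int n) {
  for (int i = 0; i < n; ++i) {
    data[i] *= 2.0;
  }
}

// Wrapper in the style described above: vector object in, vector object out.
MockSexp mockWrapScale(const MockSexp& arg) {
  std::vector<double> buffer = mockSexpToDouble(arg);
  const int n = static_cast<int>(buffer.size());
  scaleInPlace(buffer.data(), n);
  return mockDoubleToSexp(buffer.data(), n);
}
```

A real wrapper would take and return SEXP and would receive the obj external pointer as an additional argument. Both are left out here to keep the sketch free of R headers.
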
Return Values.

Input arguments do not pose much of a problem. All input SEXPs are processed with the SexpTo... functions, and the returned pointers are passed to the C++ method, dereferenced where needed. The more difficult question is what value the wrapping function should return. The first guess, to return the value of the C++ method, is not a good one, because many VTK functions return data as a side effect, i.e. through the method's input arguments. Taking this into account, the wrapping function returns an R list. Besides the value of the method, the list includes all arguments which the method takes as references or pointers not marked as const. The list is named: the return value of the VTK method has the name ".ret.val", and all other elements carry the names of the arguments prefixed with ".".
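The rule for naming the elements of the returned list can be sketched like this. ArgInfo and returnListNames are hypothetical names, and the sketch assumes a method which does return a value. How the list looks for a void method is not covered here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal description of one method argument, as it could be read from
// the XML dump. The field names are hypothetical.
struct ArgInfo {
  std::string name;
  bool byPointerOrReference;
  bool isConst;
};

// Names of the elements of the list returned by a wrapper.
std::vector<std::string> returnListNames(const std::vector<ArgInfo>& args) {
  std::vector<std::string> names;
  // The value returned by the VTK method itself.
  names.push_back(".ret.val");
  // Arguments through which the method may hand data back.
  for (const ArgInfo& arg : args) {
    if (arg.byPointerOrReference && !arg.isConst) {
      names.push_back("." + arg.name);
    }
  }
  return names;
}
```

An argument passed by value, or a pointer marked as const, does not show up in the list, because the method cannot return data through it.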

Dimensions.

When the return SEXP is built from base C++ types by means of the <c-type>ToSexp functions, the dimension of the first argument must be known. This information is not included in the XML dump (gcc has no way to find the size of the data chunk referenced by a pointer), but the dump does contain a reference to the source file where the method is defined. The parser uses that reference to extract the method header and then matches it against regular expressions like " <arg_name>[.*]" and " <arg_name>[.*][.*]". If the array dimension is found in the header, the wrapper function performs an explicit length coercion with the SET_LENGTH macro applied to the input argument. Multidimensional arguments, as well as pointers to pointers (this information can be gathered from the XML dump), are not handled by the parser, and methods which require them are not processed. If an argument is not passed by value and its length cannot be extracted from the source file, the length of the output SEXP is the same as the length of the input SEXP argument. So it is up to the end user to provide R vectors with enough space to hold all the data the VTK method would return. Of course this is not safe, because supplying too short a vector can easily lead to memory corruption, but there is no alternative.
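The header matching can be sketched as follows. This is a C++ illustration, not the R code of the parser. The function declaredLength and the exact regular expression are my assumptions: the expression accepts only a numeric size, and a two-dimensional declaration is reported as "not found", in line with the limitation described above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Look for "<argName>[N]" in a method header and return N. Returns -1
// when no fixed one-dimensional size is found.
// Assumption: argName is a plain C++ identifier, so it needs no escaping.
int declaredLength(const std::string& header, const std::string& argName) {
  // \b keeps "bounds" from matching inside a longer identifier; the
  // trailing (?!\s*\[) rejects declarations such as m[4][4].
  const std::regex pattern("\\b" + argName +
                           R"(\s*\[\s*(\d+)\s*\](?!\s*\[))");
  std::smatch match;
  if (std::regex_search(header, match, pattern)) {
    return std::stoi(match[1].str());
  }
  return -1;
}
```

When -1 comes back, the wrapper has to fall back on the length of the R vector supplied by the user, with the risk already mentioned.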
If the return value of a C++ method is a pointer to a base C type, its length is read from the hints file provided by the Kitware programmers. If the length cannot be found there, the fail-safe length 1 is used.
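The fallback can be written down in a few lines. The sketch assumes that the hints have already been read into a table keyed by class and method name. The type Hints and the function returnLength are hypothetical, and the format of the hints file itself is not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory form of the hints: (class name, method name)
// mapped to the number of elements behind the returned pointer.
using Hints = std::map<std::pair<std::string, std::string>, int>;

// Length of a pointer return value, with the fail-safe default.
int returnLength(const Hints& hints, const std::string& className,
                 const std::string& methodName) {
  auto it = hints.find(std::make_pair(className, methodName));
  // Fail-safe length 1 when the hints have no entry for this method.
  return it != hints.end() ? it->second : 1;
}
```

With the fail-safe length only the first element of the returned array reaches R, so a method missing from the hints loses data but does not read past the end of the array.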