GeneWeb

gw2sql

A GeneWeb to SQL converter

by Yann Corno - last update: 2009-06-07

 

Contents

  1. Introduction
  2. Usage
  3. Example
  4. Theory
  5. Download
  6. Changes History
  7. Special Thanks
  8. Legal Stuff
  9. Contacting me

Introduction

gw2sql converts a GeneWeb GW file into an SQL file that can be read in any SQL database. GW files, called "GeneWeb Source Files", are created by the gwu tool that is included with GeneWeb. See the GeneWeb documentation for details about this file format.

It is a PERL script, so it can be used on any platform. My original purpose was to have an automatic job running once a day that would dump all my GeneWeb database into MySQL, so that information could be processed easily: running statistics, compiling information, running other scripts (like a birthday mailing list), etc.

Usage

	perl gw2sql.pl [-i infile.gw] [-o outfile.sql] [--nocreate|-n] [--nodata] [--mysql3] [--relax]

Parameters in [brakets] are optional.

-i yourfile.gw name of your GW input file. If omitted, standard input is used.
-o yourfile.sql name of the output SQL file. If omitted, standard output is used.
-n, --nocreate do not ouput the tables creation instructions.
--nodata any gw file or standard input. Use this option to only output the table creation instructions. Useful when using sql2gw.
--mysql3 use if you have an UTF-8 encoded file, and that your SQL database does not support the ALTER TABLE db DEFAULT CHARACTER SET uft8; instruction (supported by MySQL only from v. 4.0).
--relax skip lines that could not be understood (juts issue a warning). By default, it fails when an unknown line is encountered.

Example

The following example makes a complete transfer of a GeneWeb database into a fresh MySQL one (under Unix or similar):

	gwu YourGeneWebBase | perl gw2sql.pl | mysql yourDB

Windows users will have to use files:

	gwu YourGeneWebBase -o YourGeneWebBase.gw
	perl gw2sql.pl -i YourGeneWebBase.gw -o YourGeneWebBase.sql
	Then import YourGeneWebBase.sql in your SQL database.
	

Theory

gw2sql converts a GeneWeb GW file into an SQL file that can be read in any SQL database.

It starts by printing a list of table creation instructions, defining the tables structures. (careful: there are DROP TABLES instructions, so previous versions of these tables will be trashed). You may want to modify this code to suit your own needs, such as adding indexes in these tables, or changing their prefix (see the $dbprefix variable), etc.

There are several tables created:

About referencing a person:
Because the GW format does not provide the person GeneWeb ID, :-( we have to rely on the uniqueness of the triplet Lastname + Fistname + Number. Fortunately, GeneWeb makes a good job at it in its GW format, so it works pretty well. So the database solution was to create a primary key in the gw_person table that combines the 3 values together.

About dates:
As you might know if you read the GW documentation, the GeneWeb format for dates is quite extensive. To address most uses, the dates fields are doubled. The GeneWeb date is put 'as is' in a first field. Then, if the format is a simple and complete one (i.e. dd/mm/yyyy), it is copied in SQL format in a second field. See the 'birth_date' and 'birth_date_sql' fields as an example. These '*_sql' fields are indexed, and will allow easy searches. If the GeneWeb date could not be converted, the SQL field contains NULL.

About unknown persons:
GeneWeb uses a question mark '?' to indicate an unknown last name or first name. To ensure uniqueness of persons, it is assumed that each such person is a different one from the others. Therefore, the last name and/or first name are given a unique number after the '?'. So you will end up with gw_person records that have last names like '?1234', etc.

Note:
This first version is not intended to go deeply into the GeneWeb data structures (like dates). It is merely a translation tool, and the database structure is really not sophisticated. As you will see, you end up with hollow files (lots of NULL fields). It is then up to you to use your own SQL skills to extract or compile the information in a way that fits your needs. SQL is quite convenient for this.

Download

Download gw2sql.pl (40 KB)

Changes History

See header of script file.

Special Thanks

Special thanks to my friend Erich from Germany, who kindly tested my script, uncovering many bugs. There is no good programmer without a good tester, and Erich is a great one!

Legal Stuff

This program is made available under the GeneWeb GPL Licence. GeneWeb is a software developed by Daniel de Rauglaudre, and is © Copyright INRIA. You may change and distribute this work as long as you comply with the GeneWeb GPL Licence.

IMPORTANT NOTICE: THIS WORK IS PROVIDED FOR EXPERIMENTAL PURPOSES ONLY. THERE ARE ABSOLUTELY NO WARRANTY WHATSOEVER THAT COMES WITH THIS SOURCE CODE AND THE INFORMATION DESCRIBED IN THIS PAGE. YOU MAY LOOSE YOUR DATA AND/OR HAVE COMPUTER HARDWARE OR SOFTWARE DAMAGES, OR SECURITY BREACHES. YOU ARE USING ALL THIS AT YOUR OWN RISKS. YOU HAVE BEEN WARNED !

Contacting me

You feed back or source code contributions are welcome! Contact me at . Have fun.

Yann Corno

Back to Yann's GeneWeb Tools