|
|
|
|
Programming the Perl DBIDatabase programming with PerlBy Alligator Descartes & Tim Bunce1st Edition February 2000 1-56592-699-4, Order Number: 6994 350 pages, $34.95 |
Chapter 4
Programming with the DBIIn this chapter, we'll discuss in detail the actual programming interface defined by the DBI module. We'll start with the very architecture of DBI, continue with explaining how to use the handles that DBI provides to interact with databases, then cover simple tasks such as connecting and disconnecting from databases. Finally, we'll discuss the important topic of error handling and describe some of the DBI's utility methods and functions. Future chapters will discuss how to manipulate data within your databases, as well as other advanced functionality.
DBI Architecture
The DBI architecture is split into two main groups of software: the DBI itself, and the drivers. The DBI defines the actual DBI programming interface, routes method calls to the appropriate drivers, and provides various support services to them. Specific drivers are implemented for each different type of database and actually perform the operations on the databases. Figure 4-1 illustrates this architecture.
Figure 4-1. The DBI architecture
![]()
Therefore, if you are authoring software using the DBI programming interface, the method you use is defined within the DBI module. From there, the DBI module works out which driver should handle the execution of the method and passes the method to the appropriate driver for actual execution. This is more obvious when you recognize that the DBI module does not perform any database work itself, nor does it even know about any types of databases whatsoever. Figure 4-2 shows the flow of data from a Perl script through to the database.
Figure 4-2. Data flow through DBI
![]()
Under this architecture, it is relatively straightforward to implement a driver for any type of database. All that is required is to implement the methods defined in the DBI specification,[1] as supported by the DBI module, in a way that is meaningful for that database. The data returned from this module is passed back into the DBI module, and from there it is returned to the Perl program. All the information that passes between the DBI and its drivers is standard Perl datatypes, thereby preserving the isolation of the DBI module from any knowledge of databases.
The separation of the drivers from the DBI itself makes the DBI a powerful programming interface that can be extended to support almost any database available today. Drivers currently exist for many popular databases including Oracle, Informix, mSQL, MySQL, Ingres, Sybase, DB2, Empress, SearchServer, and PostgreSQL. There are even drivers for XBase and CSV files.
These drivers can be used interchangeably with little modification to your programs. Couple this database-level portability with the portability of Perl scripts across multiple operating systems, and you truly have a rapid application development tool worthy of notice.
Drivers are also called database drivers, or DBDs, after the namespace in which they are declared. For example, Oracle uses
DBD::Oracle, Informix usesDBD::Informix, and so on. A useful tip in remembering the DBI architecture is that DBI can stand for DataBase Independent and DBD can stand for DataBase Dependent.Because DBI uses Perl's object-orientation features, it is extremely simple to initialize DBI for use within your programs. This can be achieved by adding the line:
use DBI;to the top of your programs. This line locates and loads the core DBI module. Individual database driver modules are loaded as required, and should generally not be explicitly loaded.
Handles
The DBI defines three main types of objects that you may use to interact with databases. These objects are known as handles. There are handles for drivers, which the DBI uses to create handles for database connections, which, in turn, can be used to create handles for individual database commands, known as statements. Figure 4-3 illustrates the overall structure of the way in which handles are related, and their meanings are described in the following sections.
Figure 4-3. DBI handles
![]()
Driver Handles
Driver handles represent loaded drivers and are created when the driver is loaded and initialized by the DBI. There is exactly one driver handle per loaded driver. Initially, the driver handle is the only contact the DBI has with the driver, and at this stage, no contact has been made with any database through that driver.
The only two significant methods available through the driver handle are
data_sources(), to enumerate what can be connected to, andconnect(), to actually make a connection. These methods are more commonly invoked as DBI class methods, however, which we will discuss in more detail later in this chapter.Since a driver handle completely encapsulates a driver, there's no reason why multiple drivers can't be simultaneously loaded. This is part of what makes the DBI such a powerful interface.
For example, if a programmer is tasked with the job of transferring data from an Oracle database to an Informix database, it is possible to write a single DBI program that connects simultaneously to both databases and simply passes the data backwards and forwards as needed. In this case, two driver handles would be created, one for Oracle and one for Informix. No problems arise from this situation, since each driver handle is a completely separate Perl object.
Within the DBI specification, a driver handle is usually referred to as
$drh.Driver handles should not normally be referenced within your programs. The actual instantiation of driver handles happens ``under the hood'' of DBI, typically when
DBI->connect()is called.Database Handles
Database handles are the first step towards actually doing work with the database, in that they encapsulate a single connection to a particular database. Prior to executing SQL statements within a database, we must actually connect to the database. This is usually achieved through the DBI's
connect()method:$dbh = DBI->connect( $data_source, ... );The majority of databases nowadays tend to operate in a multiuser mode, allowing many simultaneous connections, and database handles are designed accordingly. An example might be if you wanted to write a stock-monitoring program that simultaneously monitored data in tables within different user accounts in the database. A DBI script could make multiple connections to the database, one for each user account, and execute SQL statements on each. Database handles are completely encapsulated objects, meaning that transactions from one database handle cannot ``cross-over'' or ``leak'' into another.
Database handles are children of their corresponding driver handle, which supports the notion that we could also make multiple simultaneous connections to multiple database types, as well as multiple simultaneous connections to databases of the same type. For example, a more complicated DBI script could make two connections to each of an Oracle and an Informix database to perform the above-mentioned monitoring. Figure 4-3, shown earlier, illustrates the capability of having multiple database handles connecting through a driver handle to an Oracle database.
Keep in mind that had the monitoring program been written in C, two copies of code would be required, one for Oracle's programming interface and one for Informix's. DBI levels the playing field.
Within the DBI specification and sample code, database handles are usually referred to
as$dbh.Statement Handles
Statement handles are the final type of object that DBI defines for database interaction and manipulation. These handles actually encapsulate individual SQL statements to be executed within the database.
Statement handles are children of their corresponding database handle. Since statement handles are objects in their own right, data within one statement is protected from tampering or modification by other statement handles.
For a given database handle, there is no practical limit to the number of statement handles that can be created and executed.[2] Multiple statements can be created and executed within one script, and the data can be processed as it returns. A good example of this might be a data-mining robot that connects to a database, then executes a large number of queries that return all sorts of different types of information. Instead of attempting to write convoluted SQL to correlate the information within the database, the Perl script fetches all the data being returned from the many statements and performs analysis there, using the fully featured text and data manipulation routines that Perl has to offer.
Within the DBI specification and sample code, statement handles are generally referred
to as$sth.Data Source Names
When connecting to a database via the DBI, you need to tell the DBI where to find the database to connect to. For example, the database driver might require a database name, or a physical machine name upon which the database resides. This information is termed a data source name, and of all the aspects of DBI, this is possibly the most difficult to standardize due to the sheer number and diversity of connection syntaxes.
The DBI requires the data source name to start with the characters
dbi:, much like a URL begins withhttp:, and then the name of the driver, followed by another colon--for example,dbi:Oracle:. Any text that follows is passed to the driver's ownconnect()method to interpret as it sees fit. Most drivers expect either a simple database name or, more often, a set of one or morename/value pairs separated with semicolons. Some common examples are listed later in this section.For example, mSQL requires the hostname, database name, and potentially, the TCP/IP port number for connecting to the database server. However, Oracle may require only a single word that is an alias to a more complicated connection identifier that is stored in separate Oracle configuration files.
DBI offers two useful methods for querying which data sources are available to you for each driver you have installed on your system.
Firstly, you can get a list of all the available drivers installed on your machine by using the
DBI->available_drivers()method. This returns a list with each element containing the data source prefix of an installed driver,[3] such asdbi:Informix:.Secondly, you can invoke the
DBI->data_sources()method against one or more of the drivers returned by theDBI->available_drivers()method to enumerate which data sources are known to the driver.[4] Calling thedata_sources()method will actually load the specified driver and validate that it is completely and correctly installed. Because DBI dies if it can't load and initialize a driver, this method should be called inside aneval{}blockif you need to catch that error.The following script lists all the drivers and data sources for each driver on your system:
#!/usr/bin/perl -w## ch04/listdsns: Enumerates all data sources and all installed drivers#use DBI;### Probe DBI for the installed driversmy @drivers = DBI->available_drivers();die "No drivers found!\n" unless @drivers; # should never happen### Iterate through the drivers and list the data sources for each oneforeach my $driver ( @drivers ) {print "Driver: $driver\n";my @dataSources = DBI->data_sources( $driver );foreach my $dataSource ( @dataSources ) {print "\tData Source is $dataSource\n";}print "\n";}exit;The output from this script on my machine looks like:
Driver: ADODriver: CSVData source is DBI:CSV:f_dir=megalithsData source is DBI:CSV:f_dir=pictish_stonesDriver: ExamplePData Source is dbi:ExampleP:dir=.Driver: FileData Source is DBI:File:f_dir=megalithsData Source is DBI:File:f_dir=pictish_stonesDriver: ODBCDriver: ProxyDriver: XBaseData Source is dbi:XBase:.which tells us that we have the standard drivers
DBD::Proxy,DBD::ADO,DBD::File, andDBD::ExamplePinstalled, as well asDBD::ODBC,DBD::XBase, andDBD::CSV.While this may be interesting in theory, in practice you rarely need to use these methods. Most applications are written to use one data source name, either hardcoded into the application or passed in as a parameter in some way.
When specifying a data source name for a database, the text following the driver prefix should be of the form that is appropriate for the particular database that you wish to connect to. This is very database-specific, but the following table shows some examples.[5]
Database
Example Connection Syntax
mSQL
dbi:mSQL:hostname:database:port_number
For example, to connect to a database called archaeo located on a machine called fowliswester.arcana.co.uk running on port number 1114, the following
$data_sourceargument would be used:dbi:mSQL:fowliswester.arcana.co.uk:archaeo:1114
Oracle
dbi:Oracle:connection_descriptor
Oracle has a slightly less cut-and-dried way of specifying connection identifiers due to the many different ways in which the Oracle database software can actually handle connections.
To break this nightmarish topic down into bite-sized chunks, Oracle may use two different types of connection. For local connections, Oracle uses a single item of information as the connection descriptor, either the name of the database or an alias to the database as specified in the Oracle configuration files. For a network-based connection, Oracle usually needs to know the alias of the connection descriptor as specified in the Oracle configuration files, or, if you are feeling suitably masochistic, you can specify the whole connection descriptor ... but, believe me, it isn't pretty.
For example, a simple Oracle
$data_sourcevalue might be:dbi:Oracle:archaeo
CSV
dbi:CSV:f_dir=/datafiles
The
DBD::CSVmodule treats a group of comma-separated value files in a common directory as a database. The data source for this driver can contain a parameterf_dirthat specifies the directory in which the files are located.In the case of the
$data_sourceargument, an empty or undefined value will result in the environment variableDBI_DSNbeing checked for a valid value. If this environment variable is not defined, or does not contain a valid value, the DBI will calldie().Connection and Disconnection
The main activity in database programming usually involves the execution of SQL statements within a database. However, to accomplish this task, a connection to a database must be established first. Furthermore, after all the work has been done, it is good manners to disconnect from the database to free up both your local machine resources and, more importantly, valuable database resources.
Connection
In the case of simple databases, such as flat-file or Berkeley DB files, ``connecting'' is usually as simple as opening the files for reading or using the tie mechanism. However, in larger database systems, connecting may be considerably more complicated.
A relatively simple RDBMS is mSQL, which has a simple method of connection: to connect, a program connects to a TCP/IP port on the computer running the database. This establishes a live connection within the database. However, more complex systems, such as Oracle, have a lot more internal security and housekeeping work that must be performed at connection time. They also have more data that needs to be specified by the program, such as the username and password that you wish to connect with.
By looking at a broad spectrum of database systems, the information required to connect can be boiled down to:
- The data source name, a string containing information specifying the driver to use, what database you wish to connect to, and possibly its whereabouts. This argument takes the format discussed in the previous section and is highly database-specific.
- The username that you wish to connect to the database as. To elaborate on the concept of usernames a little further, some databases partition the database into separate areas, called schemas, in which different users may create tables and manipulate data. Users cannot affect tables and data created by other users. This setup is similar to accounts on a multiuser computer system, in that users may create their own files, which can be manipulated by them, but not necessarily by other users. In fact, users may decide to disallow all access to their files, or tables, from all other users, or allow access to a select group or all users.[6]
Most major database systems enforce a similar security policy, usually with an administrator having access to an account that allows them to read, modify, and delete any user's tables and data. All other users must connect as themselves. On these systems, your database username may be the same as your system login username, but it doesn't have to be.
More minimal database systems may not have any concept of username-based authentication, but you still need to supply the username and password arguments, typically as empty strings.
- The password associated with the supplied username.
In light of these common arguments, the syntax for connecting to databases using DBI is to use the
connect()call, defined as follows:$dbh = DBI->connect( $data_source, $username, $password, \%attr );The final argument,
\%attr, is optional and may be omitted.\%attris a reference to a hash that contains handle attributes to be applied to this connection. One of the most important items of the information supplied in this hash is whether or not automatic error handling should be supplied by DBI. We will discuss this in further detail in the following section, but the two common attributes are calledRaiseErrorandPrintError, which cause the DBI to die or print a warning automatically when a database error is detected.This method, when invoked, returns a database handle if the connection has been successfully made to the database. Upon failure, the value
undefis returned.To illustrate the
DBI->connect()method, assume that we have an Oracle database calledarchaeo. To connect to this database, we might use the following code:#!/usr/bin/perl -w## ch04/connect/ex1: Connects to an Oracle database.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" )or die "Can't connect to Oracle database: $DBI::errstr\n";exit;This simple example illustrates the use of the
DBI->connect()method to make one connection to the database. We also perform error checking on the call to ensure that the connection occurs; upon failure, the error message will be printed along with the database-specific reason for the failure, which will be contained within the variable$DBI::errstr.[7]A more complicated example might be to connect twice to the same database from within the one script:
#!/usr/bin/perl -w## ch04/connect/ex2: Connects to two Oracle databases simultaneously# with identical arguments. This is to illustrate# that all database handles, even if identical# argument-wise, are completely separate from# one another.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh1 = DBI->connect( "dbi:Oracle:archaeo", "username", "password" )or die "Can't make 1st database connect: $DBI::errstr\n";my $dbh2 = DBI->connect( "dbi:Oracle:archaeo", "username", "password" )or die "Can't make 2nd database connect: $DBI::errstr\n";exit;or to connect simultaneously to two different databases. For example:
#!/usr/bin/perl -w## ch04/connect/ex3: Connects to two Oracle databases simultaneously.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh1 = DBI->connect( "dbi:Oracle:archaeo", "username", "password" )or die "Can't connect to 1st Oracle database: $DBI::errstr\n";my $dbh2 = DBI->connect( "dbi:Oracle:seconddb", "username", "password" )or die "Can't connect to 2nd Oracle database: $DBI::errstr\n";exit;This former example is quite interesting, because even though we have used identical arguments to
DBI->connect(), the two database handles created are completely separate and do not share any information.A final example of using
DBI->connect()is to connect to two different databases (one Oracle, one mSQL) within the same script. In this case, DBI's automatic error reporting mechanism will be disabled in the mSQL database by passing an attribute hash to theconnect()call, as shown here:#!/usr/bin/perl -w## ch04/connect/ex4: Connects to two database, one Oracle, one mSQL# simultaneously. The mSQL database handle has# auto-error-reporting disabled.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh1 = DBI->connect( "dbi:Oracle:archaeo", "username", "password" )or die "Can't connect to Oracle database: $DBI::errstr\n";my $dbh2 = DBI->connect( "dbi:mSQL:seconddb", "username", "password" , {PrintError => 0} )or die "Can't connect to mSQL database: $DBI::errstr\n";exit;The
$usernameand$passwordarguments should be specified but may be empty ('') if not required. As discussed previously, the$data_sourceargument can also be undefined and the value of the environment variableDBI_DSNwill be used instead, if it hasbeen set.Disconnection
Explicit disconnection from the database is not strictly necessary if you are exiting from your program after you have performed all the work, but it is a good idea. We strongly recommend that you get into the habit of disconnecting explicitly.
DBI provides a method through which programmers may disconnect a given database handle from its database. This is good practice, especially in programs in which you have performed multiple connections or will be carrying out multiple sequential connections.
The method for performing disconnections is:
$rc = $dbh->disconnect();According to this definition,
disconnect()is invoked against a specific database handle. This preserves the notion that database handles are completely discrete. With multiple database handles active at any given time, each one must explictly be disconnected.An example of using
disconnect()might look like:#!/usr/bin/perl -w## ch04/disconnect/ex1: Connects to an Oracle database# with auto-error-reporting disabled# then performs an explicit disconnection.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , {PrintError => 0} )or die "Can't connect to Oracle database: $DBI::errstr\n";### Now, disconnect from the database$dbh->disconnector warn "Disconnection failed: $DBI::errstr\n";exit;Upon successful disconnection, the return value will be true. Otherwise, it will be false. In practice, failure to disconnect usually means that the connection has already been lost for some reason. After disconnecting the database handle can't be used for anything worthwhile.
What happens if you don't explicitly disconnect? Since DBI handles are references to Perl objects, Perl's own garbage collector will move in and sweep up any object trash you leave lying around. It does that by calling the object's
DESTROYmethod when there are no longer any references to the object held by your script, or when Perl is exiting.The
DESTROYmethod for a database handle will calldisconnect()for you, if you've left the handle connected, in order to disconnect cleanly from the database. But it will complain about having to do so by issuing a warning:Database handle destroyed without explicit disconnect.A major caveat with the
disconnect()method regards its behavior towards automatically committing transactions at disconnection. For example, if a program has updated data but has not calledcommit()orrollback()before callingdisconnect(), the action taken by different database systems varies. Oracle will automatically commit the modifications, whereas Informix may not. To deal with this, theDESTROYmethod has to callrollback()beforedisconnect()ifAutoCommitis not enabled. In Chapter 6, we'll discuss the effect ofdisconnect()andDESTROYon transactions in more detail.Error Handling
The handling of errors within programs, or the lack thereof, is one of the more common causes of questions concerning programming with DBI. Someone will ask "Why doesn't my program work?" and the answer generally runs along the lines of "Why aren't you performing error checking?" Sure enough, nine out of ten times when error checking is added, the exact error message appears and the cause for error is obvious.
Automatic Versus Manual Error Checking
Early versions of the DBI required programmers to perform their own error checking, in a traditional way similar to the examples listed earlier for connecting to a database. Each method that returned some sort of status indicator as to its success or failure should have been followed by an error condition checking statement. This is an excellent, slightly C-esque way of programming, but it quickly gets to be tiresome, and the temptation to skip the error checking grows.
The DBI now has a far more straightforward error-handling capability in the style of exceptions. That is, when DBI internally detects that an error has occurred after a DBI method call, it can automatically either
warn()ordie()with an appropriate message. This shifts the onus of error checking away from the programmer and onto DBI itself, which does the job in the reliable and tireless way that you'd expect.Manual error checking still has a place in some applications where failures are expected and common. For example, should a database connection attempt fail, your program can detect the error, sleep for five minutes, and automatically re-attempt a connection. With automatic error checking, your program will exit, telling you only that the connection attempt failed.
DBI allows mixing and matching of error-checking styles by allowing you to selectively enable and disable automatic error checking on a per-handle basis.
Manual error checking
Of course, the
DBI still allows you to manually error check your programs and the execution of DBI methods. This form of error checking is more akin to classic C and Perl programming, where each important statement is checked to ensure that it has executed successfully, allowing the program to take evasive action upon failure.DBI, by default, performs basic automatic error reporting for you by enabling the
PrintErrorattribute. To disable this feature, simply set the value to0either via the handle itself after instantiation, or, in the case of database handles, via the attribute hash of theconnect( )method.For example:
### Attributes to pass to DBI->connect( )%attr = (PrintError => 0,RaiseError => 0);### Connect...my $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , \%attr );### Re-enable warning-level automatic error reporting...$dbh->{PrintError} = 1;Most DBI methods will return a false status value, usually
undef, when execution fails. This is easily tested by Perl in the following way:### Try connecting to a databasemy $dbh = DBI->connect( ... )or die "Can't connect to database: $DBI::errstr!\";The following program disables automatic error handling, with our own tests to check for errors. This example also moves the attributes into the
connect( )method call itself, a clean style that's commonly used:#!/usr/bin/perl -w## ch04/error/ex1: Small example using manual error checking.use DBI; # Load the DBI module### Perform the connection using the Oracle drivermy $dbh = DBI->connect( undef, "stones", "stones", {PrintError => 0,RaiseError => 0} ) or die "Can't connect to the database: $DBI::errstr\n";### Prepare a SQL statement for executionmy $sth = $dbh->prepare( "SELECT * FROM megaliths" )or die "Can't prepare SQL statement: $DBI::errstr\n";### Execute the statement in the database$sth->executeor die "Can't execute SQL statement: $DBI::errstr\n";### Retrieve the returned rows of datamy @row;while ( @row = $sth->fetchrow_array( )) {print "Row: @row\n";}warn "Data fetching terminated early by error: $DBI::errstr\n"if $DBI::err;### Disconnect from the database$dbh->disconnector warn "Error disconnecting: $DBI::errstr\n";exit;As can be seen from the example, the code to check the errors that may have arisen in a DBI method is actually longer than the code to perform the operations themselves. Similarly, it is entirely possible that you may just genuinely forget to add a check after a statement, which may result in extremely bizarre program execution and error reporting, not to mention hours of wasted debugging time!
Automatic error checking
The automatic error checking capabilities of the DBI operates on two levels. The
PrintErrorhandle attribute tells DBI to call the Perlwarn( )function (which typically results in errors being printed to the screen when encountered) and theRaiseErrorhandle attribute (which tells DBI to call the Perldie( )function upon error, typically causing the script to immediately abort).Because the standard Perl functions of
warn( )anddie( )are used, you can change the effects ofPrintErrorandRaiseErrorwith the$SIG{_ _WARN_ _}and$SIG{_ _DIE_ _}signal handlers. Similarly, adie( )fromRaiseErrorcan be caught viaeval { ... }.These different levels of automatic error checking can be turned on for any handle, although database handles are usually the most common and useful. To enable the style of automatic error checking you want, you may set the value of either of the following two attributes:
$h->{PrintError} = 1;$h->{RaiseError} = 1;Similarly, to disable automatic error checking, simply set the value of these attributes to
0.If both
RaiseErrorandPrintErrorare enabled, an error will causewarn( )anddie( )to be executed sequentially. If no$SIG{_ _DIE_ _}handle has been defined,warn( )is skipped to avoid the error message being printed twice.[8]A more common way in which these attributes are used is to specify them in the optional attribute hash supplied to
DBI->connect( )when connecting to a database. Automatic error checking is the recommended style in which to write DBI code, soPrintErroris enabled by default inDBI->connect( ). You can think of this as training wheels for novices and grease for quick-and-dirty script writers. Authors of more significant works usually either enableRaiseErroror disablePrintErrorand do their own error checking.The following short example illustrates the use of
RaiseErrorinstead of manual error checking is:#!/usr/bin/perl -w## ch04/error/ex2: Small example using automatic error handling with# RaiseError, i.e., the program will abort upon detection# of any errors.use DBI; # Load the DBI modulemy ($dbh, $sth, @row);### Perform the connection using the Oracle driver$dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , {PrintError => 0, ### Don't report errors viawarn( )RaiseError => 1 ### Do report errors viadie( )} );### Prepare a SQL statement for execution$sth = $dbh->prepare( "SELECT * FROM megaliths" );### Execute the statement in the database$sth->execute( );### Retrieve the returned rows of datawhile ( @row = $sth->fetchrow_array( )) {print "Row: @row\n";}### Disconnect from the database$dbh->disconnect( );exit;This example is both shorter and more readable than the manual error checking shown in a following example. The actual program logic is clearer. The most obvious additional benefit is that we can forget to handle error checking manually after a DBI operation, since the DBI will check for errors for us.
Mixed error checking
You can mix error checking styles within a single program, since automatic error checking can be easily enabled and disabled on a per-handle basis. There are plenty of occasions where mixed error checking is useful. For example, you might have a program that runs continuously, such as one that polls a database for recently added stock market quotes every couple of minutes.
Disaster occurs! The database crashes! The ideal situation here is that the next time the program tries connecting to the database and fails, it'll wait a few minutes before retrying rather than aborting the program altogether. Once we've connected to the database, the error checking should now simply warn when a statement fails and not die.
This mixed style of error checking can be broken down into two areas: manual error checking for the
DBI->connect( )call, and automatic error checking viaPrintErrorfor all other statements. This is illustrated in the following example program:#!/usr/bin/perl -w## ch04/error/mixed1: Example showing mixed error checking modes.use DBI; # Load the DBI module### Attributes to pass to DBI->connect( )to disable automatic### error checkingmy %attr = (PrintError => 0,RaiseError => 0,);### The program runs forever and ever and ever and ever ...while ( 1 ) {my $dbh;### Attempt to connect to the database. If the connection### fails, sleep and retry until it succeeds ...until ($dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , %attr )) {warn "Can't connect: $DBI::errstr. Pausing before retrying.\n";sleep( 5 * 60 );}eval { ### Catch _any_ kind of failures from the code within### Enable auto-error checking on the database handle$dbh->{RaiseError} = 1;### Prepare a SQL statement for executionmy $sth = $dbh->prepare( "SELECT stock, value FROM current_values" );while (1) {### Execute the statement in the database$sth->execute( );### Retrieve the returned rows of datawhile ( my @row = $sth->fetchrow_array( )) {print "Row: @row\n";}### Pause for the stock market values to movesleep 60;}};warn "Monitoring aborted by error: $@\n" if $@;### Short sleep here to avoid thrashing the databasesleep 5;}exit;This program
demonstrates that with DBI, you can easily write explicit error checking and recovery code alongside automatic error checking.Error Diagnostics
The ability to trap errors within the DBI is very useful, with either manual or automatic error checking, but this information is only marginally useful on its own. To be truly useful, it is necessary to discern exactly what the error was in order to track it down and debug it.
To this end, DBI defines several error diagnostic methods that can be invoked against any valid handle, driver, database, or statement. These methods will inform the programmer of the error code and report the verbose information from the last DBI method called. These are:
$rv = $h->err();$str = $h->errstr();$str = $h->state();These various methods return the following items of information that can be used for more accurate debugging of errors:
$h->err()returns the error number that is associated with the current error flagged against the handle$h. The values returned will be completely dependent on the values produced by the underlying database system. Some systems may not support particularly meaningful information; for example, mSQL errors always have the error number of-1. Oracle is slightly more helpful: a connection failure may flag an ORA-12154 error message upon connection failure, which would return the value of12154by invoking$h->err(). Although this value is usually a number, you should not rely on that.
$h->errstr()is a slightly more useful method, in that it returns a string containing a description of the error, as provided by the underlying database. This string should correspond to the error number returned in$h->err().
For example, mSQL returns
-1as the error number for all errors, which is not particularly useful. However, invoking$h->errstr()provides far more useful information. In the case of connection failure, the error:ERROR : Can't connect to local MSQL servermight be generated and returned by
$h->errstr(). Under Oracle, a connection failure returning the error number of12154will return the following string as its descriptive error message:ORA-12154: TNS:could not resolve service name (DBD ERROR: OCIServerAttach)$h->state()returns a string in the format of the standard SQLSTATE five-character error string. Many drivers do not fully support this method, and upon invoking it to discern the SQLSTATE code, the value:
S1000will be returned. The specific general success code
00000is translated to0, so that if no error has been flagged, this method will return afalsevalue.The error information for a handle is reset by the DBI before most DBI method calls. Therefore, it's important to check for errors from one method call before calling the next method on the same handle. If you need to refer to error information later you'll need to save it somewhere else yourself.
A rewriting of the previous example to illustrate using the specific handle methods to report on errors can be seen in the following code:
#!/usr/bin/perl -w## ch04/error/ex3: Small example using manual error checking which also uses# handle-specific methods for reporting on the errors.use DBI; # Load the DBI module### Attributes to pass to DBI->connect()to disable automatic### error checkingmy %attr = (PrintError => 0,RaiseError => 0,);### Perform the connection using the Oracle drivermy $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , \%attr )or die "Can't connect to database: ", $DBI::errstr, "\n";### Prepare a SQL statement for executionmy $sth = $dbh->prepare( "SELECT * FROM megaliths" )or die "Can't prepare SQL statement: ", $dbh->errstr(), "\n";### Execute the statement in the database$sth->executeor die "Can't execute SQL statement: ", $sth->errstr(), "\n";### Retrieve the returned rows of datawhile ( my @row = $sth->fetchrow_array()) {print "Row: @row\n";}warn "Problem infetchrow_array(): ", $sth->errstr(), "\n"if $sth->err();### Disconnect from the database$dbh->disconnector warn "Failed to disconnect: ", $dbh->errstr(), "\n";exit;As you can see, it's even more long-winded than using the
$DBI::errstrvariable, which can at least be interpolated directly into the error messages.In addition to these three methods, which allow finely grained error checking at a handle level, there are three corresponding variables that will contain the same information, but at a DBI class level:
$DBI::err$DBI::errstr$DBI::stateUse of these variables is essentially the same as that of
$h->err()and friends, but the values referred to are for the last handle used within DBI. They are particularly handy for interpolating into strings for error messages.Since these variables are associated with the last handle used within the DBI, they have an even shorter lifespan than the handle error methods, and should be used only immediately after the method call that failed. Otherwise, it is highly likely they will contain misleading error information.
The one case where the variables are very useful is for connection errors. When these errors occur, there's no new handle returned in which to hold error information. Since scripts don't use the internal driver handles, the
$DBI::errstrvariable provides a very simple and effective way to get the error message from aconnect()failure.In summary, for most applications, automatic error checking using
RaiseErrorand/orPrintErroris recommended. Otherwise, manual checking can be used and$DBI::errstrcan easily be interpolated into messages. The handle methods are available for more complex applications.Utility Methods and Functions
To round off our basic introduction to DBI, we'll tell you about some useful utility methods and functions that will make your life that little bit easier. These include the very useful quote escaping method, DBI execution tracing, and various functions to tidy up your data.
Database-Specific Quote Handling
By far the most important utility method is
quote(), which correctly quotes and escapes SQL statements in a way that is suitable for a given database engine. This feature is important if you have a Perl string that you wish to insert into a database, as the data will be required, in most cases, to have quotation marks around it.To confuse matters, database engines tend to have a different format for specifying these surrounding quotation marks. DBI circumvents this problem by declaring the
quote()method to be executed against a database handle, which ensures that the correct quotation rules are applied.This method, when executed against a database handle, converts the string given as an argument according to defined rules, and returns the correctly escaped string for use against the database.
For example:
#!/usr/bin/perl -w## ch04/util/quote1: Demonstrates the use of the $dbh->quote()methoduse DBI;### The string to quotemy $string = "Don't view in monochrome (it looks 'fuzzy')!";### Connect to the databasemy $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , {RaiseError => 1} );### Escape the string quotes ...my $quotedString = $dbh->quote( $string );### Use quoted string as a string literal in a SQL statementmy $sth = $dbh->prepare( "SELECT *FROM mediaWHERE description = $quotedString" );$sth->execute();exit;For example, if you quoted the Perl string of
Doit!via an Oracle database handle, you would be returned the value of'Doit!'. However, thequote()method also takes care of cases such asDon'tdoit!which needs to be translated to'Don''tdoit!'for most databases. The simplistic addition of surrounding quotes would have produced'Don'tdoit!'which is not a valid SQL string literal.Some databases require a more complex
quote()method, and some drivers (though not all) have aquote()method that can cope with multiline strings and even binary data.As a special case, if the argument is
undef, thequote()method returns the stringNULL, without quotes. This corresponds to the DBI's use ofundefto representNULLvalues, and to howNULLvalues are used in SQL.Tracing DBI Execution
DBI sports an extremely useful ability to generate runtime tracing information of what it's doing, which can be a huge time-saver when trying to track down strange problems in your DBI programs.
At the highest level, you can call the
DBI->trace()method, which enables tracing on all DBI operations from that point onwards. There are several valid tracing levels:
0 Disables tracing. 1 Traces DBI method execution showing returned values and errors. 2 As for 1, but also includes method entry with parameters. 3 As for 2, but also includes more internal driver trace information. 4 Levels 4, and above can include more detail than is helpful. The
trace()method can be used with two argument forms, either specifying only the trace level or specifying both the trace level and a file to which the trace information is appended. The following example shows the use ofDBI->trace():#!/usr/bin/perl -w## ch04/util/trace1: Demonstrates the use of DBI tracing.use DBI;### Remove any old trace filesunlink 'dbitrace.log' if -e 'dbitrace.log';### Connect to a databasemy $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" );### Set the tracing level to 1 andprepare()DBI->trace( 1 );doPrepare();### Set trace output to a file at level 2 andprepare()DBI->trace( 2, 'dbitrace.log' );doPrepare();### Set the trace output back to STDERR at level 2 andprepare()DBI->trace( 2, undef );doPrepare();exit;### prepare a statement (invalid to demonstrate tracing)sub doPrepare {print "Preparing and executing statement\n";my $sth = $dbh->prepare( "SELECT * FROM megalith" );$sth->execute();return;}exit;This program generates quite a bit of trace information, of which we'll show just a small fragment:
-> prepare for DBD::Oracle::db (DBI::db=HASH(0xcd45c)~0xcd4a4 'SELECT * FROM megalith') thr0<- prepare= DBI::st=HASH(0xcd648) at trace1 line 30.-> execute for DBD::Oracle::st (DBI::st=HASH(0xcd648)~0x16afec) thr0dbd_st_execute SELECT (out0, lob0)...!! ERROR: 942 'ORA-00942: table or view does not exist (DBD ERROR: OCIStmtExecute)'<- execute= undef at trace1 line 33.DBD::Oracle::st execute failed: ORA-00942: table or view does not exist (DBD ERROR: OCIStmtExecute) at trace1 line 33.This trace information was generated with a setting of level 2, and shows the operations that DBI undertook when trying to prepare and execute a statement. Lines prepended with
-> are written when the method is being entered, and lines prepended with <-are written when the method is returning. These lines also show the information being returned from the method call. The DBI trace output is indented by four spaces to make it easier to distinguish the trace output from any other program output.You can see the
prepare()method being called along with its parameters: a database handle and the SQL statement to prepare.[9] The next line shows theprepare()returning a statement handle. It also shows the file and line number thatprepare()was called from. Following that, we seeexecute()being called, a trace line from the driver itself, and the method returning after logging an error. Finally we see the warning generated by the DBI due to thePrintErrorattribute, which is on by default.The trace information generated at level 1 is similar. The main difference is that the method entry lines (
->) are not shown.The one drawback to this form of tracing is that if your program uses a lot of handles, then the volume of tracing information could be quite vast. Similarly, you might have tracked your problem down to a specific database operation that you'd like to trace individually.
The
trace()method is also available at a handle level, allowing you to individually trace any database and statement handle operations. Therefore, you could trace operations on a given database handle to level 1 and a single statement handle to level 2. For example:### Connect to a database...my $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" );### Trace the database handle to level 1 to the screen$dbh->trace( 1 );### Create a new statementmy $sth = ...;### Trace the statement to level 2 to the file 'trace.lis'$sth->trace( 2, 'trace.lis' );Note that if a filename is specified when calling
trace(), then currently, trace output from all handles is redirected to that file.If your programs are exhibiting odd behavior or are generating errors on a regular basis, you should consider using the built-in tracing features of DBI to help you resolve your problems. This tool is extremely useful, as you will be able to see exactly what data is being passed to the database, allowing you to ensure that it's in the correct format.
Finally, tracing can also be controlled via the use of an environment variable called
DBI_TRACE, which acts in a similar manner to theDBI->trace()method. That is, it traces all handles used within the program. This environment variable can be used in three ways that are summarized in the followingtable.
DBI_TRACE Value
Effect on DBI
1
DBI->trace(1);
dbitrace.log
DBI->trace(2, 'dbitrace.log');
4=dbitrace.log
DBI->trace(4, 'dbitrace.log');
If the trace level isn't specified in the
DBI_TRACEenvironment variable, it will default to2, as shown in the table above.Neat and Tidy Formatting
The DBI features a couple of utility functions that can be used to tidy up strings into a form suitable for easy reading. These two functions are
neat()andneat_list(), the former operating on a single scalar value, the latter operating on a list of scalar values.For example, to use
neat()to tidy up some strings, you could write:#!/usr/bin/perl -w## ch04/util/neat1: Tests out the DBI::neat()utility function.#use DBI;### Declare some strings to neatifymy $str1 = "Alligator's an extremelyneat()and tidy person";my $str2 = "Oh no\nhe's not!";### Neatify this first string to a maxlen of 40print "String: " . DBI::neat( $str1, 40 ) . "\n";### Neatify the second string to a default maxlen of 400print "String: " . DBI::neat( $str2 ) . "\n";### Neatify a numberprint "Number: " . DBI::neat( 42 * 9 ) . "\n";### Neatify an undefprint "Undef: " . DBI::neat( undef ) . "\n";exit;which generates the output of:
String: 'Alligator's an extremelyneat()and...'String: 'Oh nohe's not!'Number: 378Undef: undefdemonstrating that string values are quoted,[10] whereas values known to be numeric are not. The first string has been truncated to the desired length with
...added. Undefined values are recognized and returned as the stringundefwithout quotes.While the
neat()function is handy for single values, theneat_list()function is handy for lists. It simply callsneat()on each element of the referenced list before joining the list of values together with the desired separator string. For example:#!/usr/bin/perl -w## ch04/util/neat2: Tests out the DBI::neat_list()utility functionuse DBI qw( neat_list );### Declare some strings to neatifymy @list = ( 'String-a-string-a-string-a-string-a-string', 42, 0, '', undef );### Neatify the strings into an arrayprint neat_list( \@list, 40, ", " ), "\n";exit;which generates the output of:
'String-a-string-a-string-a-string-a...', 42, 0, '', undefThis example also shows that the utility functions can be imported into your package so you can drop the
DBI::prefix.DBI uses
neat()andneat_list()internally to format the output generated by tracing. That's important to know if you're wondering why the trace output is truncating your huge SQL statementsdownto 400 characters.[11]Numeric Testing
The final utility function supplied by DBI that we'll look at is quite a curious one called
looks_like_number(). This function quite simply tells you whether or not a value looks like a number or not.
looks_like_number()operates by taking a list of values as an argument and returns a new array signifying whether or not the corresponding value within the original array was a number, not a number, or undefined.This may seem rather a curious thing to want to do, but in the case of handling large quantities of data, it's useful for working out which values might need to have their quotes escaped via the
quote()method.The returned array will contain the same number of values as the original data array, with the elements containing one of three values signifying the following:
true The original value is a number.false The original value is not a number.undef The original value is empty or undefined.The following example illustrates how this process works:
#!/usr/bin/perl -w## ch04/util/lookslike1: Tests out the DBI::looks_like_number()function.#use DBI;### Declare a list of valuesmy @values = ( 333, 'Choronzon', 'Tim', undef, 'Alligator', 1234.34,'Linda', 0x0F, '0x0F', 'Larry Wall' );### Check to see which are numbers!my @areNumbers = DBI::looks_like_number( @values );for (my $i = 0; $i < @values; ++$i ) {my $value = (defined $values[$i]) ? $values[$i] : "undef";print "values[$i] -> $value ";if ( defined $areNumbers[$i] ) {if ( $areNumbers[$i] ) {print "is a number!\n";}else {print "is utterly unlike a number and should be quoted!\n";}}else {print "is undefined!\n";}}exit;The results from this program illustrate how the values are treated and shows that hexadecimal values are not treated as numbers:
values[0] -> 333 is a number!values[1] -> Choronzon is utterly unlike a number and should be quoted!values[2] -> Tim is utterly unlike a number and should be quoted!values[3] -> undef is undefined!values[4] -> Alligator is utterly unlike a number and should be quoted!values[5] -> 1234.34 is a number!values[6] -> Linda is utterly unlike a number and should be quoted!values[7] -> 15 is a number!values[8] -> 0x0F is utterly unlike a number and should be quoted!values[9] -> Larry Wall is utterly unlike a number and should be quoted!The first
0x0Fin the list of values is reported as looking like a number because Perl converted it into one (15) when the script was compiled. The second is not reported as looking like a number because thelooks_like_number()function only looks for integers and floating-point numbers.And that brings us to the end of the introduction to DBI and its architecture. We'll be talking more on how to actually do stuff with DBI in the next chapter.
1.Few methods actually need to be implemented since the DBI provides suitable defaults for most of them. The
DBI::DBDmodule contains documentation for any intrepid driver writers.2. In reality, the number of concurrent statement handles is dependent on the underlying database. For information on how many concurrent statement handles your database can support, see Appendix B.
3. The actual definition of ``installed driver'' is a little loose. The DBI simply searches the directories in
@INClooking for any DBD subdirectories that contain .pm files. Those are assumed to be drivers. It does not verify that the modules are completely and correctly installed. In practice, this process is fast and works well.4. Note that not necessarily every data source that is reachable via the driver is returned. Similarly, the inclusion of a data source does not imply that it is actually currently available for connection.
5. An excellent example of an application that figures out data source names at runtime is
dbish, discussed more fully in Chapter 8.6. In general, this is true. However, some database systems, such as MySQL, support different users but only one schema.
7. Actually, the error message will be displayed twice for reasons that will be explained in the "Error Handling" section later in this chapter.
8. The exact behavior when both attributes are set may change in future versions. This is something to consider if the code is inside an
eval.9. If the Perl you are using was built with threading enabled, then each method entry line also shows the thread number, e.g.,
thr0. The DBI implements a per-driver mutex so that each DBD driver may only be entered by one thread at a time. Trace levels 4 and above show this in action.10. Note that internal quotes are not escaped. That's because
neat()is designed to produce output for human readers, and to do so quickly since it's used by the internal trace mechanisms. If you wish quote escaping to occur, you could use thequote()method instead.11. 400 characters is the default value for the
$DBI::neat_maxlenvariable, which defines the default maximum length for theneat()function.
Back to: Programming the Perl DBI
© 2001, O'Reilly & Associates, Inc.
webmaster@oreilly.com