30. CartoWeb and WMS Usage Statistics [plugin] (statsReports)

The StatsReports plugin displays statistics on CartoWeb and WMS usage. Results can be shown as tables, charts or maps.

30.1. Import and Reports

Statistics are generated from the output of CartoWeb's Accounting plugin or from Apache logs (for WMS statistics). A Java application processes the logs and stores the raw data in a PostgreSQL database.

30.1.1. How To Build?

The sources of the Java application are located in scripts/stats. To build this project, you need:

  • Maven 2 (http://maven.apache.org)
  • Java Runtime Environment JRE >= 1.5
  • A PostgreSQL database

The other external libraries are handled by Maven.

Next, set the STATS_DB environment variable. On Linux it looks like:

export STATS_DB="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD"

This variable is needed by the automatic tests that run after the build. Then simply run the following build command:

mvn clean install

Warning

Be sure your locale is set to UTF-8, otherwise at least one test will fail (testSimple(org.cartoweb.stats.imports.WmsReaderTest)).

30.1.2. Running Import

CartoWeb logs are imported using the following command (remove end-of-line backslashes on Windows):

java -Xmx1G \
     -cp target/stats-standalone.jar org.cartoweb.stats.imports.Import \
     --initialize --tableName=stats \
     --db="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD" \
     --logDir=/my/log/dir/ --format=cartoweb \
     --logRegexp="\d\d\.\d\d\.\d\d\d\d\.log\$"

where:

  • -Xmx1G - tells Java to use a maximum of 1 GB of memory for the import. This should be enough even for very large log files
  • --initialize - tells the application to create the database. Remove this option to use incremental import once the database has been created
  • --tableName - table prefix. Different prefixes allow several environments to share a single database. For instance, use this option if you want to import both CartoWeb and WMS logs into the same database
  • --db - JDBC connection string
  • --logDir - log directory. The directory is processed recursively
  • --format - log format. Currently supported values are CartoWeb, SyslogCartoWeb, WMS, SecureWMS, HaproxyWMS, SquidTilecache, VarnishTilecache
  • --logRegExp - regular expression that limits the processed log files to those matching the expression
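
Before running an import, it can help to check which files a given expression would select. The following Python sketch is only an illustration: it assumes that the importer searches for the expression in each file name (the exact matching rule is not stated here), and the file names are made up. Note that the \$ in the commands above is simply a $ anchor escaped for the shell.

```python
import re

# Pattern as the Java application receives it once the shell has
# turned "\$" into "$".
LOG_PATTERN = re.compile(r"\d\d\.\d\d\.\d\d\d\d\.log$")


def select_logs(file_names):
    """Return the names in which the pattern is found (assumed rule)."""
    return [name for name in file_names if LOG_PATTERN.search(name)]


# Hypothetical directory content.
candidates = [
    "accounting.25.07.2006.log",
    "accounting.25.07.2006.log.gz",
    "accounting.log",
]
selected = select_logs(candidates)
print(selected)
```

Under these assumptions only the first file is selected: the compressed file is rejected by the $ anchor and the last one contains no date.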

For WMS imports, you need to specify one of the following two options:

  • --mapIdRegExp - regular expression that finds the project name in the WMS query. This allows several WMS applications on one server to share the same Apache logs.
  • --mapIdConfig - name of an .ini file containing a [mapIDs] section that defines which map ID to assign depending on a partial string matched against the WMS log entry.

Example for a WMS import with a regular expression:

java -Xmx1G \
     -cp target/stats-standalone.jar org.cartoweb.stats.imports.Import \
     --initialize --tableName=stats_wms \
     --db="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD" \
     --logDir=/my/log/apachedir/ --format=wms \
     --logRegexp="\d\d\.\d\d\.\d\d\d\d\.log\$" --mapIdRegExp="GET /wms-([^\/]*)\//"

Example for a WMS import with a configuration file:

java -Xmx1G \
     -cp target/stats-standalone.jar org.cartoweb.stats.imports.Import \
     --initialize --tableName=stats_wms \
     --db="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD" \
     --logDir=/my/log/apachedir/ --format=wms \
     --logRegexp="\d\d\.\d\d\.\d\d\d\d\.log\$" --mapIdConfig="mapIDs.ini"

Content of the mapIDs.ini file:

[mapIDs]
GET /ogc-sitn/wms = main
GET /ogc-sitn-annuaire/wms = annuaire
GET /ogc-sitn-v1/wms = v1
GET /ogc-sitj-v1/wms = jv1
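
The partial string matching described for --mapIdConfig can be pictured with the following Python sketch. It is not the importer's code: it assumes that a key matches when it is contained in the log entry and that the first matching key wins, and the log entry itself is invented for the example.

```python
import configparser

INI_TEXT = """
[mapIDs]
GET /ogc-sitn/wms = main
GET /ogc-sitn-annuaire/wms = annuaire
GET /ogc-sitn-v1/wms = v1
GET /ogc-sitj-v1/wms = jv1
"""


def load_map_ids(text):
    """Read the [mapIDs] section into a dict {partial string: map ID}."""
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep the keys case-sensitive
    parser.read_string(text)
    return dict(parser["mapIDs"])


def find_map_id(log_entry, map_ids):
    """Return the map ID of the first key contained in the entry (assumed rule)."""
    for fragment, map_id in map_ids.items():
        if fragment in log_entry:
            return map_id
    return None


map_ids = load_map_ids(INI_TEXT)
# Hypothetical Apache access log entry.
entry = ('10.0.0.1 - - [25/Jul/2006:20:21:00 +0200] '
         '"GET /ogc-sitn-v1/wms?SERVICE=WMS&REQUEST=GetMap HTTP/1.1" 200 5120')
print(find_map_id(entry, map_ids))
```

Because every key ends with /wms, the key for ogc-sitn does not accidentally match requests to ogc-sitn-v1 or ogc-sitn-annuaire.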

For Tilecache imports (SquidTilecache and VarnishTilecache), you need to specify the following option:

  • --tilecacheConfig - filename for a .ini file that contains Tilecache configuration.

Content of the Tilecache configuration file:

[referers]
referer1.com = myproject
referer2.net = myproject

[extent]
xmin = 420000
ymin = 30000
xmax = 900000
ymax = 350000

[resolutions]
0 = 4000
1 = 2500
2 = 1000
3 = 500
4 = 250
5 = 100
6 = 50

[tiles]
size = 256
dpi = 254

Help on the possible parameters of a command can be obtained by running it without parameters.

30.1.3. Configuring Reports

A report is a set of parameters that defines how data will be aggregated in order to visualize statistics. For instance, parameters include the list of criteria (dimensions) that will be available to the end user. Reports are defined in an INI file. A typical section looks like:

[scale project]
; A comment
label = Stats per project
type = simple
periods.day = 30
periods.month = 12
periods.year = 5
values = pixel, count, countPDF, countDXF
dimensions = project
filters.project = sitn, jura

The fields are:

  • label - report name for the GUI
  • type - type of report (see below)
  • periods.* - how to divide the time (see below)
  • values - what values are stored (see below)
  • dimensions - what criteria the user will be able to specify in the GUI (see below)
  • filters.* - what records you want to take into account (see below)

30.1.3.1. Report Types

Currently available types are:

  • simple - graphs or tables
  • gridbbox - colored maps based on the bounding box of the viewed maps
  • gridcenter - colored maps based on the center of the viewed maps

If the type is gridbbox or gridcenter, you have to add a few fields to the configuration that define the position, size and granularity of the grid. For example:

type = gridbbox
minx = 522000
miny = 187000
size = 500
nx = 106
ny = 76
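
To get a feeling for the area covered by such a grid, the arithmetic can be sketched in Python. The interpretation used here is an assumption: size is taken as the side length of one square cell in map units, and nx and ny as the number of columns and rows starting at (minx, miny). The Grid class and its methods are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    minx: float
    miny: float
    size: float  # assumed: side length of one square cell, in map units
    nx: int      # assumed: number of columns
    ny: int      # assumed: number of rows

    def extent(self):
        """Return (minx, miny, maxx, maxy) of the whole grid."""
        return (self.minx, self.miny,
                self.minx + self.nx * self.size,
                self.miny + self.ny * self.size)

    def cell_of(self, x, y):
        """Return (column, row) of the cell containing the point, or None."""
        col = int((x - self.minx) // self.size)
        row = int((y - self.miny) // self.size)
        if 0 <= col < self.nx and 0 <= row < self.ny:
            return col, row
        return None


grid = Grid(minx=522000, miny=187000, size=500, nx=106, ny=76)
print(grid.extent())
print(grid.cell_of(523250, 187400))
```

With the values of the example above, the grid would span 53 km by 38 km, that is from (522000, 187000) to (575000, 225000).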

30.1.3.2. Periods

The possible period units are:

  • hour
  • day
  • week
  • month
  • year

The value specifies the number of those units to keep in the DB. Here is an example for generating a report aggregated by week, for the last 12 weeks:

periods.week = 12

The report generator considers the current time to be the time of the last hit matching the filters. A period is taken into account even if it contains no records.
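
The two rules above can be illustrated with a small Python sketch for the day unit. It is one plausible reading, not the generator's actual algorithm: the function name, the sample hits and the way periods are counted back from the last hit are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def daily_series(hit_days, nb_days):
    """Return [(day, count)] for the nb_days days ending at the last hit."""
    if not hit_days:
        return []
    counts = Counter(hit_days)
    last = max(hit_days)  # the last hit plays the role of "now"
    days = [last - timedelta(days=i) for i in range(nb_days - 1, -1, -1)]
    # Days without any hit are kept, with a count of zero.
    return [(day, counts.get(day, 0)) for day in days]


# Hypothetical hits, equivalent to "periods.day = 4".
hits = [date(2006, 7, 22), date(2006, 7, 22), date(2006, 7, 25)]
series = daily_series(hits, 4)
for day, count in series:
    print(day.isoformat(), count)
```

In this reading, the series ends on the day of the last hit rather than on the day the report is generated, and the two days without hits still appear with a value of zero.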

30.1.3.3. Values

A list, separated by ',', of any of the following:

  • count - the number of hits (PDF and DXF outputs excluded)
  • countPDF - the number of PDFs generated
  • countDXF - the number of DXFs generated
  • pixel - the sum of pixels generated

30.1.3.4. Dimensions

A list, separated by ',', of any of the following:

  • project
  • user
  • scale
  • size
  • theme
  • layer
  • pdfFormat
  • pdfRes

If you use the scale dimension, you have to add a field that specifies the limits used to discretize the scale value. For example:

dimensions = scale
scales=1000,5000,10000,50000,100000,500000,1000000,5000000
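
Discretization means that each scale value is assigned to the class delimited by two consecutive limits. A minimal Python sketch of the idea follows; whether a limit belongs to the class below or above it is not specified here, so the choice made in the code (the limit belongs to the class below) is an assumption, as are the function names.

```python
from bisect import bisect_left

SCALES = [1000, 5000, 10000, 50000, 100000, 500000, 1000000, 5000000]


def scale_class(scale, limits=SCALES):
    """Index of the class: 0 is up to the first limit,
    len(limits) is above the last limit."""
    return bisect_left(limits, scale)


def class_label(index, limits=SCALES):
    """Human-readable label of a class index."""
    if index == 0:
        return f"<= {limits[0]}"
    if index == len(limits):
        return f"> {limits[-1]}"
    return f"{limits[index - 1]} - {limits[index]}"


for scale in (800, 2500, 5000, 6000000):
    print(scale, "->", class_label(scale_class(scale)))
```

The eight limits of the example therefore produce nine classes, including the two open-ended ones.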

30.1.3.5. Filters

You can define as many filters as you want, one line per filter. The available filters are (each must be prefixed with filters.):

  • scale - a range of scales (floating point)
  • width - a range of map widths
  • height - a range of map heights
  • project - a list, separated by ',', of project names
  • layer - a list, separated by ',', of layer names
  • theme - a list, separated by ',', of theme names
  • user - a list, separated by ',', of user names
  • pdfFormat - a list, separated by ',', of PDF format names
  • pdfRes - a range of PDF resolutions
  • ip - a list of IP criteria

For lists of names, you can use '*' to match any string of characters. For example, cn* will match cn25 and cn50. For ranges of values, you can give either a range like 2-5 or a single value.
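
The following Python sketch shows how such name lists and ranges could be interpreted. It is illustrative only: the helper names are invented, and it assumes that a pattern must match the whole name and that both bounds of a range are included.

```python
import re


def name_matches(name, patterns):
    """True if the name matches one of the patterns, where '*' stands
    for any string of characters (whole-name match assumed)."""
    for pattern in patterns:
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, name):
            return True
    return False


def parse_range(text):
    """'2-5' gives (2.0, 5.0); a single value '3' gives (3.0, 3.0)."""
    low, sep, high = text.partition("-")
    low = float(low)
    return (low, float(high)) if sep else (low, low)


def in_range(value, text):
    """True if the value lies within the range (bounds assumed inclusive)."""
    low, high = parse_range(text)
    return low <= value <= high


layers = [item.strip() for item in "cn*, adresses".split(",")]
print(name_matches("cn25", layers), name_matches("orthophoto", layers))
print(in_range(320, "290-350"), in_range(400, "290-350"))
```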

A list of IP addresses is a list of IP criteria, separated by commas. An IP criterion can take several forms:

  • 192.168.1.34 - matches only the given IP address
  • 192.168.1.0/24 - matches every IP address in the range from 192.168.1.0 to 192.168.1.255
  • 192.168.1.0/255.255.255.0 - matches every IP address in the range from 192.168.1.0 to 192.168.1.255

If an IP criterion is preceded by a ! sign, every record whose IP address matches it is rejected. The criteria are checked in order and the first one that matches determines the result of the filter. If no criterion matches, the record is ignored. So, for example:

192.168.1.10, !192.168.1.0/255.255.255.0, 0.0.0.0/0
      
  • 192.168.1.1 - is ignored
  • 192.168.1.10 - is taken
  • 114.12.27.10 - is taken
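
The first-match rule can be reproduced with Python's standard ipaddress module. This is a sketch of the rule as described above, not the code of the report generator; the function names are chosen for the example.

```python
import ipaddress


def parse_criteria(text):
    """Turn 'a, !b/mask, c/len' into a list of (accept, network) pairs."""
    criteria = []
    for item in text.split(","):
        item = item.strip()
        accept = not item.startswith("!")
        # A bare address becomes a one-address network; both the /24 and
        # the /255.255.255.0 notations are understood by ip_network().
        network = ipaddress.ip_network(item.lstrip("!"), strict=False)
        criteria.append((accept, network))
    return criteria


def ip_is_taken(ip, criteria):
    """Apply the criteria in order; the first one that matches decides."""
    address = ipaddress.ip_address(ip)
    for accept, network in criteria:
        if address in network:
            return accept
    return False  # no criterion matched: the record is ignored


criteria = parse_criteria("192.168.1.10, !192.168.1.0/255.255.255.0, 0.0.0.0/0")
for ip in ("192.168.1.1", "192.168.1.10", "114.12.27.10"):
    print(ip, "is taken" if ip_is_taken(ip, criteria) else "is ignored")
```

The order of the criteria matters: if the negated network came first, 192.168.1.10 would be rejected before its own criterion was ever checked.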

Some examples of filters:

filters.scale = 1000.5-5000
filters.width = 290-350
filters.height = 290-350
filters.project = sitn, agri
filters.layer = cn*, adresses
filters.theme = orthophotos, default
filters.user = Jules, Jean
filters.pdfFormat = A4, A5
filters.pdfRes = 290-350
filters.ip = 192.168.1.10, !192.168.1.0/255.255.255.0, 0.0.0.0/0

30.1.4. Running Reports Generation

Raw data imported by the Import application is aggregated using the following command (remove end-of-line backslashes on Windows):

java -Xmx1G -cp target/stats-standalone.jar org.cartoweb.stats.report.Reports \
     --iniFilename=myReports.ini --tableName=stats \
     --db="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD" \
     --purgeOnConfigurationChange

where:

  • -Xmx1G - tells Java to use a maximum of 1 GB of memory for the aggregation. This should be enough even for very large amounts of data
  • --iniFilename - name of the INI configuration file
  • --tableName - table prefix. Different prefixes allow several environments to share a single database. For instance, use this option if you want to import both CartoWeb and WMS logs into the same database
  • --db - JDBC connection string
  • --purgeOnConfigurationChange - tells the application to drop all aggregated data when a change is found in a report configuration. Without this option, a warning is displayed and old data is not erased. Warning: aggregated report data can NOT be rebuilt if the original raw data has been purged!

30.1.5. Purging the Data

Raw report data is purged using the following command (remove end-of-line backslashes on Windows):

java -Xmx1G -cp target/stats-standalone.jar org.cartoweb.stats.purge.Purge \
     --tableName=stats --nbDays=100 \
     --db="jdbc:postgresql://localhost/MYDB?user=MYUSER&password=MYPASSWORD"

where:

  • -Xmx1G - tells Java to use a maximum of 1 GB of memory for the purge. This should be enough even for very large amounts of data
  • --tableName - table prefix. Different prefixes allow several environments to share a single database. For instance, use this option if you want to import both CartoWeb and WMS logs into the same database
  • --nbDays - number of days of raw data to keep
  • --db - JDBC connection string

30.2. Statistics Visualization

The StatsReports visualization plugin is a standard CartoWeb client and server plugin. It must be activated on both sides.

30.2.1. Client-side Configuration

A typical client-side statsReports.ini looks like:

datas.cartoweb.label = Statistiques Cartoweb
datas.cartoweb.dsn = pgsql://MYUSER:MYPASSWORD@localhost/MYDB
datas.cartoweb.prefix = stats

datas.wms.label = Statistiques WMS
datas.wms.dsn = pgsql://MYUSER:MYPASSWORD@localhost/MYDB
datas.wms.prefix = statswms

tempDsn = pgsql://MYUSER:MYPASSWORD@localhost/MYTEMPDB
nColors = 16

where:

  • datas.*.label - environment label (for the GUI)
  • datas.*.dsn - database connection
  • datas.*.prefix - table prefix (to use more than one environment on one database)
  • tempDsn - database connection for temporary tables (could be the same as other database connection)
  • nColors - number of colors for map statistics (linear distribution)

Please note that each CartoWeb user should have only one session at a time. This is due to the cache management (one graph/map per user).

Additional parameters are available to configure the CSV export of the tabular data:

outputCsv = 1
csvShowHeaders = 1
csvSeparator = ";"
filename = statsReport_customfilename.csv
csvUseTextDelimiter = 1
csvTextDelimiter = #

where:

  • outputCsv - set to 1 to enable the CSV export link, 0 to disable it. Default is 1.
  • csvShowHeaders - set to 1 to display line and column headers in the CSV, 0 to hide them. Default is 1.
  • csvSeparator - defines the character used as separator in the CSV. Default is "," (comma).
  • filename - defines a custom filename. It may be static or contain the generation date in various formats. To format the date, put the keyword date between square brackets, followed by a comma and a PHP date()-like format (see http://php.net/date). For example, export_[date,Ymd-Hi].csv gives export_20060725-2021.csv. Default is cartoweb_statsReport.csv.
  • csvTextDelimiter - defines the character used to delimit the text in each cell. It is especially useful when the character used as csvSeparator may be found within the cell content. Default is the double quote, i.e. `"`.
  • csvUseTextDelimiter - set to 1 to enable the text delimiter, 0 to disable it. Default is 0.
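
The date substitution in the filename parameter can be mimicked in Python to preview the resulting name. The plugin itself relies on PHP's date(); the sketch below only translates a handful of PHP format characters to strftime directives, and both the function name and the mapping table are assumptions made for the example.

```python
import re
from datetime import datetime

# Small subset of PHP date() characters and their strftime equivalents.
PHP_TO_STRFTIME = {
    "Y": "%Y",  # 4-digit year
    "y": "%y",  # 2-digit year
    "m": "%m",  # month, 01-12
    "d": "%d",  # day of month, 01-31
    "H": "%H",  # hour, 00-23
    "i": "%M",  # minutes, 00-59
    "s": "%S",  # seconds, 00-59
}


def expand_filename(template, when):
    """Replace each [date,FORMAT] placeholder by the formatted date."""
    def replace(match):
        php_format = match.group(1)
        fmt = "".join(PHP_TO_STRFTIME.get(ch, ch) for ch in php_format)
        return when.strftime(fmt)
    return re.sub(r"\[date,([^\]]*)\]", replace, template)


generated = datetime(2006, 7, 25, 20, 21)
print(expand_filename("export_[date,Ymd-Hi].csv", generated))
print(expand_filename("cartoweb_statsReport.csv", generated))
```

A static name without a placeholder is returned unchanged.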

30.2.2. Server-side Configuration

A typical server-side statsReports.ini looks like:

layer = my_stats_layer

where:

  • layer - name of the statistics layer in the mapfile (see below). Default value is stats

30.2.3. Related Elements in Mapfile

In order to display map statistics, the following layer must be added to the mapfile. The layer name must be the one set in the INI file.

LAYER
  NAME "my_stats_layer"
  TYPE RASTER
  DATA ""
  TRANSPARENCY 80
  STATUS ON
END

30.2.4. Related Elements in layers.ini

In order to display map statistics of type gridbbox or gridcenter, you must also add "my_stats_layer" to layers.ini (server side).
