User Tools

Site Tools


reports:egreports:data_types

This is an old revision of the document!


Data Types, Transforms, and Operators

Data Types

The term “data type” means exactly what it appears to mean: when a piece of data is stored, its data type indicates what kind of information it is. The data type tells the system how to treat the piece of information and determines how that information can be used or viewed.

For example, when Evergreen needs to store a time, such as a checkout time for a circulation, it will store it as a timestamp. “Timestamp” is a specific data type that tells the system to treat that piece of data as a specific time and indicates what options are available for interpretation of that data. The following listing covers the data types used by Evergreen.

bool: Bool is short for “boolean”; this data type denotes that the field contains either “true” or “false.”

int: Short for “integer,” this data type denotes that the field contains a number. Occasionally, you will see fields that contain monetary amounts with this data type, however, usually int indicates that the field contains a number such as a patron's Claims Returned Count.

interval: Interval fields contain time intervals, such as “2 weeks” or “6 months.” The recurrence interval for fines and the time limit for age hold protection are stored as intervals.

id: id fields are the unique identifiers that the software uses to distinguish database tables from one other; for example, each circulation that is recorded has an id that no other circulation record has. Ids look like numbers, but the id data type is treated specially by the software for determining how tables are linked.

link: The link data type is similar to the id datatype. Links look like numbers if you use them in a report's output and are used by the software to link tables together.

money: Fields with the money data type contain monetary amounts, such as the amount billed to a patron on a billing line item.

org_unit: Short for “organizational unit”, org_unit is a special data type. It can act like a link (for example, the Home Library field for a patron's record links that patron with his or her library branch) but it also lets you choose branches and systems from a list when you use a field of this type as a filter.

text: Fields with the text data type contain plain text: patron names, call number labels, and other strings of text are stored with this data type.

timestamp: Timestamp is a very important data type for reporting; it contains times in a very specific manner that the software can understand. Unless a transform is applied, timestamps aren't particularly friendly to look at; however, many transforms are available to make this data type easier to use for displays and filters.

Transforms

Transforms determine how data is interpreted when it is retrieved from the database. Each data type has a specific set of transforms available; some are more useful to computer systems, some are more useful for humans. The 'raw data' transform is available to every data type; it indicates that the data should be treated exactly as it is stored.

This section discusses the difference between aggregate and non-aggregate transforms and contains a listing of the transforms available in Evergreen.

Aggregate and Non-Aggregate Transforms

Most transforms are non-aggregate; non-aggregate transforms only present the data from a single row. On the other hand, aggregate transforms bundle data from multiple rows together. For example, say you have a report that should present data from the following three rows:

Last Name Claims Returned Count Patron ID
Levine 3 23923
Maddox 1 43613
Smith 5 43547

If you don't care about anything other than the total claims returned count, you can apply the aggregate transform sum to the Claims Returned Count. Since sum is an aggregate transform, the Claims Returned Count from all three rows will be presented in a single row in the output:

Claims Returned Count
9

On the other hand, if you apply a non-aggregate transform, such as raw data, to the Claims Returned Count, each entry will appear on a different row in the output:

Claims Returned Count
3
1
5

Average: The average transform is an aggregate transform. When you apply this transform to a field, the output will contain the average value of the field from every database row that meets the filter criteria. This transform is available for int and money data types.

Count: Count is an aggregate transform; it gives a count of all the fields found; in general, you should use Count Distinct rather than count to avoid unexpected results. Count is available for text, int, id, money, and timestamp data types.

Count Distinct: Count Distinct is an aggregate transform; it gives a count of all the unique fields found; in general, you should use this transform rather than count to avoid unexpected results. Count Distinct is available for text, int, id, money, and timestamp data types.

Raw Data: The raw data transform presents the data exactly as it is stored in the database. Raw data is available for all data types.

First Five Characters: This transform is non-aggregate and returns the first five characters of the text stored in a field. This transform is particularly useful for filtering or sorting ZIP Code fields. First Five Characters is available for the text data type.

First Continuous Non-space String: First Continuous Non-space String is a non-aggregate transform. As the name indicates, it returns the first word (or string of numbers or characters) in a field. First Continuous Non-space String is available for the text data type.

First Value: While not technically an aggregate transform, First Value presents one of many fields. It presents the value that was added to the database first. So, say you have three patrons that were entered in the following order.

Last Name Claims Returned Count Patron ID
Levine 3 23923
Maddox 1 43613
Smith 5 43547

If you apply First Value to the Last Name field, it will return “Levine” since that is the oldest value for that field. First Value is available for the text, int, money, and timestamp data types.

Last Value: Last Value is identical to First Value except that it returns the most recently added value rather than the one that has been in the database for the longest. Last Value is available for the text, int, money, and timestamp data types.

Lower Case: Lower Case presents the text stored in a field as all lower case. So, if a field contains the text “Smith,” the lower case transform would present that field as “smith.” Lower Case is available for the text data type.

Max: Max is another technically non-aggregate transform – it returns the highest value available, rather than the oldest or newest values that first value or last value return. Max is available for text, int, money, and timestamp.

Min: Min is identical to max except that it returns the lowest value available from the database rather than the largest or highest. Min is available for text, int, money, and timestamp.

Substring: Substring is an unusual transform; it only applies for filters, not for display fields. Say you want to get a list of every patron that has the letters “ith” in their last name; you could set up a filter on the Last Name field with substring as the transform and “Equal to” as the operator. Then, you could specify “ith” when you run the report and patrons with names like “Smith” or “Witherson” will be included in the output. Substring is only available for text.

Upper Case: This transform presents the text stored in a field as all upper case; it is good for matching text fields if you don't know which case is used in the database. Upper Case is only available for the text datatype.

Round: Round presents numerical data as the closest integer. It is available for int and money.

Sum: Sum is an aggregate transform that adds up the values of all applicable fields. Be careful not to confuse sum with count or count distinct. Sum is available for int and money.

Age: Age presents a timestamp as the interval between it and the current date. For example, if today's date is November 10, 2008 and a field contains a timestamp that was created on November 3, 2008, age would present that field as “0 mons 7 days 07:02:08.00186.” Age is only available for timestamp data type.

Date: This transform presents a timestamp as a human-readable date. Date is available for timestamp fields.

Day Name: This transform presents a timestamp as the day of the week by its name, such as Monday, Tuesday, Wednesday, etc. Day Name is only available for timestamp fields.

Day of Month: Day of Month presents a timestamp as the numerical day of the month; for example, a timestamp generated for January 19, 2007 would appear as “19” if day of month is applied to it. Day of Month is available for timestamp fields.

Day of Week: Day of Week presents a timestamp as the numerical day of the month. A timestamp generated for a Monday would appear as “1” if day of week is applied to it. Day of Week is available for timestamp fields.

Day of Year: Day of Year presents a timestamp as the numerical day of the year. A timestamp generated for February 2 would appear as “32” if day of year is applied to it. Day of Year is available for timestamp fields.

Hour: Hour presents a timestamp as the hour for which it was generated, for example, a timestamp generated at 2:17 PM would appear as “14.” Hour is available for timestamp fields.

Hour of Day: Hour of day is identical to Hour.

Month Name: Month Name presents a timestamp as the name of the month, such as “January.” Month Name is available for timestamp fields.

Month of Year: Month of Year displays the numerical month; for example, a timestamp for February 15, 2008 appears as “2” when Month of Year has been applied to it. Month of Year is available for timestamp fields.

Months Ago: Months Ago presents the number of months that have passed between the time listed in the field and the time the report is run. Months Ago is available for timestamp fields.

Quarter of Year: Quarter of Year displays the numerical quarter – for example, a timestamp for December 12, 2008 would appear as “4.” Quarter of Year is available for timestamp fields.

Quarters Ago: Like Months Ago, Quarters Ago presents the number of quarters that have passed between the time listed in a field and the time the report is run. Quarters Ago is available for timestamp fields.

Week of Year: Week of Year displays the numerical week of a year (1–52) for which a timestamp is set. Week of Year is available for timestamp fields.

Year: The Year transform displays only the year portion of a timestamp.

Year + Month: Year + Month presents a timestamp as the year and month, for example, July 31, 2005 appears as “2005-7” when this timestamp is applied. Year is available for timestamp fields.

Operators

reports/egreports/data_types.1650994334.txt.gz · Last modified: 2022/04/26 17:32 by smorrison