Table of Content

  • Introduction
  • Logical reasoning
    • Typical project phases
    • Local time
    • Sending to client
  • Manifesto :)
    • Facts
    • Rules
  • Investigations
    • Introduction
    • Create and populate table
    • Populate other columns with conversion
    • Find storage requirements for above data
    • Results
  • Links

Introduction

Many software projects have Oracle database on its back end. And most of projects store in database some timing information. These may be timestamps, with second and even millisecond precision. Or just dates. Examples are time of document created and someone's birthday.

Logical reasoning

Typical project phases

Phase 1 - project start. Database created fast and dirty, to get immediate results and applause. Any time is TIMESTAMP, current time is CURRENT_TIMESTAMP, any day is again TIMESTAMP with 0:00:00.000000 time. In best scenario day is DATE with missing time.

Phase 2 - project grows, dirty and temporary data became fixed and production ready (of course, too late to change something).

Phase 3 - real data and real millions of records come, and everything becomes SLOW and out-of-memory.

Local time

One more disaster - storing localized time in database. Oracle's CURRENT_TIMESTAMP generates time in current session time zone. In most cases timing data are extracted from database to be delivered to a client. And in most cases - to several clients. And often these clients are in different time zones. It does mean that time must be sent in universal time zone - of course in UTC. If it is served to client in UTC, why it should be stored in some other specific time zone ?

Sending to client

"... to be delivered to a client" = to be SENT to a client, by some network protocol. In some format. Serialized or stringified. Stringified form: "07/05/2017" is 7th of May or 5th of Jule ? JSON - has no date time representation. JavaScript has.

Manifesto :)

Facts

  • FACT - Databases do not provide simple and efficient date and time handling
  • FACT - Transaction protocols do not provide simple and efficient date and time handling
  • FACT - Most of projects suffer from database bottleneck

Rules

  • RULE - Database, server side processing and network transactions - process date/time only in UTC
  • RULE - moments of time are presented as milliseconds from epoch in UTC - 8 bytes/64 bits numeric - long in Java - one word on x64 platform;
  • RULE - days are presented as days from epoch in UTC - 2 bytes - short in Java;
  • RULE - time (milliseconds) is stored as NUMBER(13) in Oracle, or BIGINT in other databases;
  • RULE - a day is stored as NUMBER(5) in Oracle, or SMALLINT in other databases;
  • RULE - milliseconds and days are sent over network as above explained numbers - longs and shorts - easy serialized in the most portable way;
  • RULE - milliseconds and days are converted to a user readable strings only on client side (normally in JS) - when local zone, i18n, l10n are known.

Investigations

Introduction

How to keep timing data in database. Let's investigate, how much space it takes in database. In Oracle database.

Create and populate table

Table contains columns of frequently used data types - TIMESTAMP and DATE. And the same information is stored in numeric columns.
create table TIMING_SIZE (
  MS13 NUMBER(13),  -- milliseconds from epoch - exactly 13 digits
  MS NUMBER,        -- milliseconds from epoch - lazy version
  DAY5 NUMBER(5),   -- day from epoch - exactly 5 digits
  DAYL NUMBER,      -- day from epoch - lazy version
  TS TIMESTAMP,     -- this is normally how it is designed - nanoseconds which are never used
  TS3 TIMESTAMP(3), -- should be stored with millisecond precision
  DAYX DATE         -- this is normally how it is designed
);
insert into TIMING_SIZE (TS, TS3, DAYX) values (CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, CURRENT_DATE);

Populate other columns with conversion

Please note, how complex and cumbersome is to get universally used "milliseconds after epoch" in Oracle environment.
update TIMING_SIZE set DAY5 = extract(day from (TS - TIMESTAMP '1970-01-01 00:00:00.000'));

update TIMING_SIZE set DAYL = DAY5,
  MS13 = DAY5 * 24 * 3600000 +
          extract(hour from (TS - TIMESTAMP '1970-01-01 00:00:00.000')) * 3600000 +
          extract(minute from (TS - TIMESTAMP '1970-01-01 00:00:00.000')) * 60000 +
          extract(second from (TS - TIMESTAMP '1970-01-01 00:00:00.000')) * 1000;

update TIMING_SIZE set MS = MS13;
select * from TIMING_SIZE;
After all columns populated, we can see:
MS13          MS            DAY5  DAYL  TS                              TS3                             DAYX                        
1519145687690 1519145687690 17582 17582 20-FEB-18 04.54.47.690000000 PM 20-FEB-18 04.54.47.690000000 PM 20-FEB-18                   

Find storage requirements for above data

select vsize(MS13), vsize(MS), vsize(DAY5), vsize(DAYL), vsize(TS), vsize(TS3), vsize(DAYX)
from TIMING_SIZE;
VSIZE(MS13) VSIZE(MS) VSIZE(DAY5) VSIZE(DAYL) VSIZE(TS) VSIZE(TS3) VSIZE(DAYX)                 
8           8         4           4           11        11         7                           

Results

For Oracle database, which has all numeric and character types variable length:
  • In all cases simple numeric representation of time information take less time than standard Oracle types;
  • Setting size of numeric columns does not affect storage requirements;
  • For timestamps best to use MS NUMBER, which takes 8 bytes;
  • For days best to use DAYL NUMBER, which takes 4 bytes.
  • For
For PostgreSQL, SQL Server and MySQL: SMALLINT and BIGINT are optimal.

Taking into account other systems

Oracle by default consider column with type NUMBER having maximum precision, that does not affect used space. It is fine for us just to read NUMBER column into long. But some tools like JOOQ expects the column to contain maximum values and generate appropriate code. So any column of type NUMBER will be read into Java's BigDecimal, which is very expensive object. To avoid this, use NUMBER(5) to store days and NUMBER(13) to store milliseconds.

Links

Stack Overflow: Memory consumption java.util.Date vs java.time.LocalDate vs org.joda.time.DateTime
Stack Overflow: How many bytes of memory does a java.util.Date object use?
Stack Overflow: The “right” JSON date format
Stack Overflow: oracle systimestamp (sysdate) to milliseconds
Oracle Database SQL Reference: Datatypes
Oracle Database SQL Reference: CURRENT_TIMESTAMP
Oracle Dates, Timestamps and Intervals
Oracle Timestamp Elapsed Math

Home  
Terms and Conditions (c) 2005, 2006, 2008, 2009, 2011, 2012 NAN