AnyLogic

Data set

A data set stores 2D (X,Y) data of type double and keeps the minimum and maximum of the stored data up to date for each dimension. The data set keeps a limited number of the latest data items.

You can use time as the X-values of the data set, i.e. associate an observed value with the moment in time when it is sampled. A queue length, a vehicle coordinate, and a fluid level are examples of such values. Such a data set is called timed.

Alternatively, you can record the dependency of one value on another; such data sets are called phased.

Demo model: Dataset for Cost Data. Open the model page in AnyLogic Cloud, where you can run the model or download it (by clicking Model source files).

Adding a new item when the dataset is full discards the oldest sample. If the discarded item held the minimum or maximum, a new search for min/max is started, which can be quite time-consuming for large datasets.
For large datasets it is therefore recommended to set the size to at least the number of items you plan to add.
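
The cost described above can be illustrated with a small sketch. The class BoundedSeries below is a hypothetical stand-in written for this illustration, not the AnyLogic DataSet class, and its internals are an assumption. It tracks only the Y minimum and maximum and counts how often a full search is needed.

```java
// Hypothetical stand-in, NOT the AnyLogic DataSet class: a fixed-capacity
// buffer of (x, y) samples that tracks the minimum and maximum of y.
class BoundedSeries {
    private final double[] xs;
    private final double[] ys;
    private int start = 0;   // index of the oldest sample
    private int size = 0;    // number of stored samples
    private double yMin = Double.POSITIVE_INFINITY;
    private double yMax = Double.NEGATIVE_INFINITY;
    int rescans = 0;         // counts full min/max searches, for illustration

    BoundedSeries(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        xs = new double[capacity];
        ys = new double[capacity];
    }

    void add(double x, double y) {
        if (size == xs.length) {
            // Buffer is full: the oldest sample is overwritten.
            double lostY = ys[start];
            xs[start] = x;
            ys[start] = y;
            start = (start + 1) % xs.length;
            if (lostY == yMin || lostY == yMax) {
                rescan();            // full search over all stored samples
                return;
            }
        } else {
            int i = (start + size) % xs.length;
            xs[i] = x;
            ys[i] = y;
            size++;
        }
        // Constant-time update when no extreme value was lost.
        yMin = Math.min(yMin, y);
        yMax = Math.max(yMax, y);
    }

    private void rescan() {
        rescans++;
        yMin = Double.POSITIVE_INFINITY;
        yMax = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < size; k++) {
            double v = ys[(start + k) % xs.length];
            yMin = Math.min(yMin, v);
            yMax = Math.max(yMax, v);
        }
    }

    double getYMin() { return yMin; }
    double getYMax() { return yMax; }
    int size() { return size; }
}
```

In this sketch, a series whose capacity is at least the number of added items never runs the full search, which is the reason for the sizing recommendation.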

You can set the data set to write data into the model execution log — datasets_log. To do this, select the data set’s option Log to database, and enable the model to write to the log as described in Model execution logs.

While running the model, you can view the collected data and optionally copy it to the clipboard so that later on you can paste it to some other application (e.g. to Excel) to perform statistical analysis.

To create a data set

  1. Drag the Data Set element from the Analysis palette into the graphical editor.
  2. Go to the Properties view.
  3. Select the Use time as horizontal axis value check box to make the data set timed. In the Vertical axis value box, specify the expression that is dynamically evaluated to obtain the current data set value.
  4. Otherwise, to make the data set phased, clear this check box and specify the expressions for both Horizontal axis value and Vertical axis value.
  5. A data set may limit the maximum number of points it can store. Enter the tail size in the Keep up to ... latest samples box.
  6. Finally, choose how you want this data set to be updated.

Properties

General

Name — The name of the data set. The name is used to identify and access the data set.

Ignore — If selected, the data set is excluded from the model.

Visible — If selected, the data set is visible on a presentation at runtime.

Show name — If selected, the name of the data set is displayed on a presentation diagram.

Use time as horizontal axis value — If selected, the data set is timed, i.e. when new samples are added to the data set, the Y-value is evaluated using the specified Vertical axis value expression, while the X-value takes the current model time.
Otherwise, the data set is phased, i.e. both the X- and Y-values of the data set are evaluated using the specified expressions (Horizontal axis value and Vertical axis value, respectively).
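
The difference between the two modes can be sketched as follows. Sampler, its factory methods, and the clock supplier are hypothetical names introduced for this illustration; they are not part of the AnyLogic API. The sketch only shows where the X-value comes from in each mode.

```java
import java.util.function.DoubleSupplier;

// Hypothetical stand-in for illustration; not the AnyLogic API.
class Sampler {
    private final DoubleSupplier clock;        // stands in for the model time
    private final DoubleSupplier xExpression;  // null for a timed sampler
    private final DoubleSupplier yExpression;

    private Sampler(DoubleSupplier clock, DoubleSupplier x, DoubleSupplier y) {
        this.clock = clock;
        this.xExpression = x;
        this.yExpression = y;
    }

    // Timed: X is taken from the clock, only Y has an expression.
    static Sampler timed(DoubleSupplier clock, DoubleSupplier y) {
        return new Sampler(clock, null, y);
    }

    // Phased: both X and Y are evaluated from expressions.
    static Sampler phased(DoubleSupplier x, DoubleSupplier y) {
        return new Sampler(null, x, y);
    }

    // Evaluates one (x, y) sample at the moment of the call.
    double[] sample() {
        double x = (xExpression == null)
                ? clock.getAsDouble()
                : xExpression.getAsDouble();
        return new double[] { x, yExpression.getAsDouble() };
    }
}
```

For example, a timed sampler built with a queue-length expression yields (time, queue length) pairs, while a phased sampler built with two expressions yields (first value, second value) pairs regardless of time.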

Horizontal axis value — [Enabled if Use time as horizontal axis value is not selected] Expression that will be dynamically evaluated to obtain the current X-value of the phased data set.

Vertical axis value — Expression that will be dynamically evaluated to obtain the current Y-value of the data set.

Keep up to ... latest samples — The “tail size” of the data set. Defines the number of the latest data items this data set will keep.

Do not update data automatically — If selected, the data set is not updated automatically. In this case you should add new samples yourself as described in Updating analysis data objects.

Update data automatically — If selected, new data samples are added automatically starting at the exact time (select the option Use model time) or date (select Use calendar dates) specified below and repeating with the specified Recurrence time.

Log to database — If selected, data collected by this data set element will be added into the model execution log — datasets_log (if logging is turned on in the model’s Database properties).

Functions

Populating the data set
Function Description
void add(double x, double y) Adds a new data item to the data set.
void add(double y) Adds a new value to the data set. This function is supported only by datasets with the option Use time (run number) as horizontal axis value selected. Calling this function for other datasets throws a runtime error.
void fillFrom(DataSet ds) Makes this dataset an exact copy of the given original dataset.
void fillFrom(TableFunction tf) Discards all existing data, sets the capacity equal to the number of entries in the given table function and fills the dataset from the given table function.
Retrieving data
Function Description
double getXMin() Returns the minimum of all x values of all stored items, or +infinity in case there are no items.
double getXMax() Returns the maximum of all x values of all stored items, or -infinity in case there are no items.
double getYMax() Returns the maximum of all y values of all stored items, or -infinity in case there are no items.
double getYMin() Returns the minimum of all y values of all stored items, or +infinity in case there are no items.
double getX(int i) Returns the x of the data item with the given index (which must be in the range 0..size()-1).
double getY(int i) Returns the y of the data item with the given index (which must be in the range 0..size()-1).
int size() Returns the number of items stored in the data set.
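
As a rough illustration of index-based access, the sketch below walks all items from 0 to size()-1. The XYData interface and the ArrayXYData class are hypothetical stand-ins that mirror only the size(), getX(int) and getY(int) accessors listed above, so that the example runs without the AnyLogic engine. The SeriesStats helpers are likewise assumptions made for this example.

```java
// Hypothetical interface mirroring only the documented accessors.
interface XYData {
    int size();
    double getX(int i);
    double getY(int i);
}

// Array-backed stand-in used only to make the sketch runnable.
class ArrayXYData implements XYData {
    private final double[] xs;
    private final double[] ys;

    ArrayXYData(double[] xs, double[] ys) {
        this.xs = xs;
        this.ys = ys;
    }

    public int size() { return xs.length; }
    public double getX(int i) { return xs[i]; }
    public double getY(int i) { return ys[i]; }
}

class SeriesStats {
    // Mean of all stored y values, or NaN when the series is empty.
    static double meanY(XYData d) {
        if (d.size() == 0) {
            return Double.NaN;
        }
        double sum = 0;
        for (int i = 0; i < d.size(); i++) {
            sum += d.getY(i);
        }
        return sum / d.size();
    }

    // X of the item with the largest y, or NaN when the series is empty.
    static double xAtMaxY(XYData d) {
        double bestX = Double.NaN;
        double bestY = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < d.size(); i++) {
            if (d.getY(i) > bestY) {
                bestY = d.getY(i);
                bestX = d.getX(i);
            }
        }
        return bestX;
    }
}
```

With a real data set, the same loops would call size(), getX(i) and getY(i) on the data set object directly.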
Resetting data
Function Description
void reset() Discards all stored data and their minimum/maximum.
Capacity
Function Description
int getCapacity() Returns the capacity of the data set.
void setCapacity(int newcapacity) Resizes the data set according to the new capacity.
Duplicating values
Function Description
void allowDuplicateX(boolean yes) Sets how two consecutive calls of add() with identical X values are handled.

yes — if true, two entries are created in the dataset; otherwise the second item overrides the last one.
void allowDuplicateY(boolean yes) Sets how two consecutive calls of add() with identical Y values are handled.

yes — if true, two entries are created in the dataset; otherwise the second item overrides the last one.
boolean duplicateXAllowed() Tests if subsequent data items with the same X values are allowed in this dataset. Returns true if yes, otherwise false.
boolean duplicateYAllowed() Tests if subsequent data items with the same Y values are allowed in this dataset. Returns true if yes, otherwise false.
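
The override behavior for identical X values can be sketched as below. DedupSeries is a hypothetical stand-in, not the AnyLogic class; the default value of the flag and the use of exact equality for comparing X values are assumptions made for this illustration. Handling of identical Y values would be analogous.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the AnyLogic DataSet class.
class DedupSeries {
    private final List<double[]> items = new ArrayList<>();
    private boolean duplicateXAllowed = true;  // default is an assumption

    void allowDuplicateX(boolean yes) { duplicateXAllowed = yes; }

    boolean duplicateXAllowed() { return duplicateXAllowed; }

    void add(double x, double y) {
        int last = items.size() - 1;
        if (!duplicateXAllowed && last >= 0 && items.get(last)[0] == x) {
            // Same x as the previous item: override it instead of appending.
            items.set(last, new double[] { x, y });
        } else {
            items.add(new double[] { x, y });
        }
    }

    int size() { return items.size(); }
    double getX(int i) { return items.get(i)[0]; }
    double getY(int i) { return items.get(i)[1]; }
}
```

In this sketch, adding (1, 10) and then (1, 20) with duplicates disallowed leaves a single item (1, 20), whereas with duplicates allowed both items are kept.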
Textual representation
Function Description
String toString() Returns a tab-separated multi-line textual representation of the data set containing not more than 1000 data items.
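
A tab-separated export of this kind, suitable for pasting into a spreadsheet, can be sketched as follows. TsvExport and toTabSeparated are hypothetical names. The exact layout produced by toString() is not specified here, so the one-item-per-line format and the choice to keep the first items up to the limit are assumptions.

```java
// Hypothetical helper for illustration; not part of the AnyLogic API.
class TsvExport {
    // Writes one "x<TAB>y" line per item, for at most `limit` items.
    static String toTabSeparated(double[] xs, double[] ys, int limit) {
        int n = Math.min(Math.min(xs.length, ys.length), limit);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(xs[i]).append('\t').append(ys[i]).append('\n');
        }
        return sb.toString();
    }
}
```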