Working with Columns
INFO
Find here all information about working with columns in Datameer.
INFO
Each workbook can contain a maximum of 702 columns.
Renaming Columns
To rename a column:
- Right-click a column name and select "Rename" or double-click a column name. The column name is marked as editable.
INFO: To cancel column renaming, press the 'Escape' key.
- Enter the new column name.
INFO: Column names must only contain standard, capital oder lower-case characters, numbers and/ or underscores. A column name cannot begin with a number.
INFO: Column names in all workbook sheet types are case-sensitive, e.g. the column names 'Foo', 'fOO' etc. are unique columns within the same worksheet.
- Confirm the entry with "Enter" key. Renaming is finished.
Adding Columns
INFO
A column can only be added next to a column which has content. Added columns appear to the left of the outgoing column.
To add a column right-click a column name and select "Add Column". The new, empty column appears to the left. Adding a column is finished.
Reordering Columns
To remove a column left-click on a column header and drag and drop it at the required location.
INFO: A trailing mark shows the new location of the removed column by showing the column name during drag and drop.
Resizing Columns
To resize and adapt a column width, place the mouse between two column headings so the icon changes to a double-ended arrow and drag the column marker to the desired width.
Splitting Columns
INFO
The split column operation can be updated as many times as you like until you have desired split configuration.
To split a workbook:
- Right-click on a column header and select "Split Column" or click on the "Split Column" icon in the toolbar. The dialog 'Split to columns' opens in the Workbook Inspector.
or - Select the required column.
INFO: The column name from where the right-click was executed is set per default in the field. You can change the column if needed. - Select the split type from the drop-down.
- Enter the delimiter.
INFO: A comma is set as delimiter per default. - Decide if leading or trailing characters shall be skipped.
- Decide if elements shall be skipped and in how many columns the column shall be divided.
- Decide if additional elements shall be either caught in an overflow column or be dropped. Decide also if to Trim Whitespace and how to define empty values by marking the respective checkbox.
- Enter the column prefix for the split column and confirm with "Create Sheet". The split column is displayed in a new sheet.
INFO: The new column name begins with 'Name_' per default.
Encoding Columns
INFO
Column encoding performs ordinal, one-hot or binned encoding on column data which assigns a unique numeric value to each categorical or continuous value. Once applied, column encoding can be updated as needed until the desired results are achieved.
Columns with a high cardinality are not suited for ordinal/ binned encoding.
Column encoding provides a consistent view of prepared data which is especially helpful for teams working together on model building and testing activities.
Ordinal Encoding
For ordinal encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
or - If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "Ordinal Encoding" from the drop-down. Further selection options adapt to the needs.
- Decide how to deal with unknown values by clicking the required statement.
INFO: 'Drop Value' ignores values beyond the first 100 most frequent. 'Default Value' shows values, which can not be encoded.
- View the top 32 values (by count).
- If needed, add a new value in the blank field, change the order of the top values or delete single values.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. Ordinal Encoding is finished.
One-hot Encoding
For one-hot encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
or - If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "1-Hot Encoding" from the drop-down. Further selection options adapt to the needs.
- Decide how to deal with unknown values by clicking the required statement.
INFO: 'Drop Value' ignores values beyond the first 100 most frequent. 'Include at last column' adds a new element to the list that encodes together all values beyond the 100 most frequent. - Select the output format from the dropdown 'Output'.
INFO: 'As List' keeps all binary pairs together in a single column. 'As Column' creates binary pairs each in their own column. - View the top 32 values (by count).
- If needed, add a new value in the blank field, change the order of the top values or delete single values.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. One-hot encoding is finished.
Binned Encoding
For binned encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
or - If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "Binned Encoding" from the drop-down. Further selection options adapt to the needs.
- Select the output format from the dropdown 'Output'.
INFO: 'As List' keeps all binary pairs together in a single column. 'Ordinal' encodes as ordinal numbers in a single column.
- View the default value distributions.
INFO: The graph changes according to the amount of dividers.
- Enter the required bin dividers to change the percentile size of the divider.
INFO: There are 3 dividers set as default, e.g. 'Divider 1' contains the 25% of the selected column values.
INFO: To delete a divider, click on "x" next to the required divider.
- If needed, click on "Add new Divider" to add a new divider.
INFO: Clinking the button adds an additional bucket, recalculates the percentile size and the corresponding absolute values.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. Binned encoding is finished.
Hiding Columns
TIP
To hide or expand a larger number of columns at the same time, use the 'Toggle Columns' option.
To hide a column, right-click on a column header and select "Hide Column". The whole column hides.
Expanding Columns
To expand a hidden column, double-click in the small area where the column was before. The whole column appears.
To expand a long column header, right-click in the column header and select "Expand Column". Only the column header is displayed fully.
INFO
To expand all column headers, right-click in the column header and select "Expand All Columns". Only the column headers are displayed fully.
INFO
Removing Columns
To remove a column right-click on a column name and select "Remove Column". The column is removed.
INFO: Column deletion is not possible in a source sheet.