How to Interpret a Data Profiling Report
Pandas Profiling Report
Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 10 |
| Number of observations | 97498 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.4 MiB |
| Average record size in memory | 80.0 B |
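The two memory figures can be cross-checked against each other, which is a quick sanity test when reading this table. The sketch below assumes each of the 10 columns costs 8 bytes per row; that per-cell size is an assumption chosen because it matches the 80.0 B average, not something the report states.

```python
# Sanity-check the memory figures from the dataset statistics table.
n_rows = 97498          # Number of observations
n_cols = 10             # Number of variables
bytes_per_cell = 8      # assumption: 8 bytes per value (64-bit number or pointer)

record_bytes = n_cols * bytes_per_cell      # bytes per row
total_bytes = n_rows * record_bytes         # bytes for the whole table
total_mib = total_bytes / 1024 ** 2         # 1 MiB = 1024 * 1024 bytes

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the result agrees with the reported 80.0 B per record and 7.4 MiB in total.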
Variable types
| Type | Count |
|---|---|
| Categorical | 6 |
| DateTime | 1 |
| Numeric | 3 |
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2023-07-28 17:31:25.751127 |
| Analysis finished | 2023-07-28 17:31:41.087460 |
| Duration | 15.34 seconds |
| Software version | ydata-profiling v4.1.2 |
| Download configuration | config.json |
ID Ticket
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Statistic | Value |
|---|---|
| Distinct | 97498 |
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| GDDENR-5042564453 | 1 |
| TDLTSR-6543590870 | 1 |
| TDLTSR-6743525276 | 1 |
| TDLTSR-6643795007 | 1 |
| TDLTSR-6643672052 | 1 |
| Other values (97493) | 97493 |
Length
| Statistic | Value |
|---|---|
| Max length | 17 |
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 1657466 |
| Distinct characters | 23 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
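To make the character properties concrete, here is a minimal sketch that classifies the characters of one ticket ID by Unicode general category using Python's standard `unicodedata` module. It illustrates the idea only; it is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

ticket = "GDDENR-5042564453"  # first sample row of ID Ticket
counts = category_counts(ticket)

# Lu = Uppercase Letter, Nd = Decimal Number, Pd = Dash Punctuation
print(len(ticket), dict(counts))
```

Each ID has 6 uppercase letters, 1 dash and 10 digits. Multiplied by the 97498 rows, that gives 584988, 97498 and 974980 characters, which matches the "Most occurring categories" table for this variable.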
Unique
| Statistic | Value |
|---|---|
| Unique | 97498 |
| Unique (%) | 100.0% |
Sample
| Row | Value |
|---|---|
| 1st row | GDDENR-5042564453 |
| 2nd row | GDDENR-8042508060 |
| 3rd row | GDDESR-1342539995 |
| 4th row | GDDTSR-5942488006 |
| 5th row | GDLEER-0042524120 |
Common Values
| Value | Count | Frequency (%) |
| GDDENR-5042564453 | 1 | < 0.1% |
| TDLTSR-6543590870 | 1 | < 0.1% |
| TDLTSR-6743525276 | 1 | < 0.1% |
| TDLTSR-6643795007 | 1 | < 0.1% |
| TDLTSR-6643672052 | 1 | < 0.1% |
| TDLTSR-6643670597 | 1 | < 0.1% |
| TDLTSR-6543823716 | 1 | < 0.1% |
| TDLTSR-6543775787 | 1 | < 0.1% |
| TDLTSR-6543742069 | 1 | < 0.1% |
| TDLTSR-6543637734 | 1 | < 0.1% |
| Other values (97488) | 97488 | > 99.9% |
[Figure: Histogram of lengths of the category]
| Value | Count | Frequency (%) |
| gddenr-5042564453 | 1 | < 0.1% |
| gdleer-2342666259 | 1 | < 0.1% |
| gdleer-0142608095 | 1 | < 0.1% |
| gdleer-0242564650 | 1 | < 0.1% |
| gdleer-0542574815 | 1 | < 0.1% |
| gdleer-0842457219 | 1 | < 0.1% |
| gdleer-1242542213 | 1 | < 0.1% |
| gdleer-1342611596 | 1 | < 0.1% |
| gdleer-7342441622 | 1 | < 0.1% |
| gdleer-1442518153 | 1 | < 0.1% |
| Other values (97488) | 97488 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 187129 | 11.3% |
| 3 | 130592 | 7.9% |
| T | 112170 | 6.8% |
| 0 | 97943 | 5.9% |
| - | 97498 | 5.9% |
| 2 | 95170 | 5.7% |
| L | 90048 | 5.4% |
| R | 85244 | 5.1% |
| 1 | 82403 | 5.0% |
| 5 | 78179 | 4.7% |
| Other values (13) | 601090 | 36.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 974980 | 58.8% |
| Uppercase Letter | 584988 | 35.3% |
| Dash Punctuation | 97498 | 5.9% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 112170 | 19.2% |
| L | 90048 | 15.4% |
| R | 85244 | 14.6% |
| S | 60126 | 10.3% |
| E | 53581 | 9.2% |
| H | 35549 | 6.1% |
| D | 29766 | 5.1% |
| N | 29193 | 5.0% |
| G | 29063 | 5.0% |
| K | 27709 | 4.7% |
| Other values (2) | 32539 | 5.6% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 187129 | 19.2% |
| 3 | 130592 | 13.4% |
| 0 | 97943 | 10.0% |
| 2 | 95170 | 9.8% |
| 1 | 82403 | 8.5% |
| 5 | 78179 | 8.0% |
| 6 | 76785 | 7.9% |
| 7 | 76283 | 7.8% |
| 8 | 75910 | 7.8% |
| 9 | 74586 | 7.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 97498 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1072478 | 64.7% |
| Latin | 584988 | 35.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| T | 112170 | 19.2% |
| L | 90048 | 15.4% |
| R | 85244 | 14.6% |
| S | 60126 | 10.3% |
| E | 53581 | 9.2% |
| H | 35549 | 6.1% |
| D | 29766 | 5.1% |
| N | 29193 | 5.0% |
| G | 29063 | 5.0% |
| K | 27709 | 4.7% |
| Other values (2) | 32539 | 5.6% |
Common
| Value | Count | Frequency (%) |
| 4 | 187129 | 17.4% |
| 3 | 130592 | 12.2% |
| 0 | 97943 | 9.1% |
| - | 97498 | 9.1% |
| 2 | 95170 | 8.9% |
| 1 | 82403 | 7.7% |
| 5 | 78179 | 7.3% |
| 6 | 76785 | 7.2% |
| 7 | 76283 | 7.1% |
| 8 | 75910 | 7.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1657466 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 187129 | 11.3% |
| 3 | 130592 | 7.9% |
| T | 112170 | 6.8% |
| 0 | 97943 | 5.9% |
| - | 97498 | 5.9% |
| 2 | 95170 | 5.7% |
| L | 90048 | 5.4% |
| R | 85244 | 5.1% |
| 1 | 82403 | 5.0% |
| 5 | 78179 | 4.7% |
| Other values (13) | 601090 | 36.3% |
Fecha
DateTime
| Statistic | Value |
|---|---|
| Distinct | 1827 |
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |
| Minimum | 2016-01-01 00:00:00 |
| Maximum | 2020-12-31 00:00:00 |
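One way to read Distinct together with Minimum and Maximum: if the number of distinct dates equals the number of calendar days in the range, then every day in the range has at least one ticket. A minimal check using only the values in the table, and assuming every value falls on midnight as the minimum and maximum do:

```python
from datetime import date

first = date(2016, 1, 1)    # Minimum
last = date(2020, 12, 31)   # Maximum
distinct_reported = 1827    # Distinct

# Inclusive number of calendar days between the two bounds.
calendar_days = (last - first).days + 1
print(calendar_days, calendar_days == distinct_reported)
```

The two numbers agree, so under that assumption the date range has no gaps.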
[Figure: Histogram with fixed size bins (bins=50)]
Employee ID
Numeric
| Statistic | Value |
|---|---|
| Distinct | 2000 |
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 999.28502 |
| Minimum | 1 |
| Maximum | 2000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 761.8 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | 1 |
| 5-th percentile | 101 |
| Q1 | 500 |
| Median | 999 |
| Q3 | 1499 |
| 95-th percentile | 1901 |
| Maximum | 2000 |
| Range | 1999 |
| Interquartile range (IQR) | 999 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 577.40151 |
| Coefficient of variation (CV) | 0.57781464 |
| Kurtosis | -1.1983956 |
| Mean | 999.28502 |
| Median Absolute Deviation (MAD) | 500 |
| Skewness | 0.0059080822 |
| Sum | 97428291 |
| Variance | 333392.51 |
| Monotonicity | Not monotonic |
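Several of these statistics are simple functions of the others, which helps when deciding which rows carry independent information. The sketch below recomputes four of them from the reported values; small differences in the last digits come from rounding in the report.

```python
# Values copied from the Employee ID tables.
mean, std = 999.28502, 577.40151
q1, q3 = 500, 1499
minimum, maximum = 1, 2000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 4), round(variance, 1))
```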
[Figure: Histogram with fixed size bins (bins=50)]
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 754 | 73 | 0.1% |
| 285 | 73 | 0.1% |
| 636 | 71 | 0.1% |
| 1341 | 70 | 0.1% |
| 523 | 69 | 0.1% |
| 79 | 69 | 0.1% |
| 1448 | 68 | 0.1% |
| 442 | 68 | 0.1% |
| 482 | 68 | 0.1% |
| 326 | 68 | 0.1% |
| Other values (1990) | 96801 | 99.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 41 | < 0.1% |
| 2 | 49 | 0.1% |
| 3 | 47 | < 0.1% |
| 4 | 55 | 0.1% |
| 5 | 50 | 0.1% |
| 6 | 49 | 0.1% |
| 7 | 39 | < 0.1% |
| 8 | 60 | 0.1% |
| 9 | 48 | < 0.1% |
| 10 | 48 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2000 | 51 | 0.1% |
| 1999 | 40 | < 0.1% |
| 1998 | 46 | < 0.1% |
| 1997 | 52 | 0.1% |
| 1996 | 45 | < 0.1% |
| 1995 | 43 | < 0.1% |
| 1994 | 40 | < 0.1% |
| 1993 | 48 | < 0.1% |
| 1992 | 40 | < 0.1% |
| 1991 | 50 | 0.1% |
Agent ID
Numeric
| Statistic | Value |
|---|---|
| Distinct | 50 |
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.468328 |
| Minimum | 1 |
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 761.8 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | 1 |
| 5-th percentile | 3 |
| Q1 | 13 |
| Median | 26 |
| Q3 | 38 |
| 95-th percentile | 48 |
| Maximum | 50 |
| Range | 49 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 14.449695 |
| Coefficient of variation (CV) | 0.56735941 |
| Kurtosis | -1.2023269 |
| Mean | 25.468328 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.0014041438 |
| Sum | 2483111 |
| Variance | 208.79369 |
| Monotonicity | Not monotonic |
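Skewness near zero together with kurtosis near -1.2 suggests that tickets are spread almost evenly across agents. As a rough check, the sketch below computes the theoretical mean, standard deviation and excess kurtosis of a discrete uniform distribution on 1..n. It assumes the agent IDs are exactly the integers 1 to 50, which is consistent with Distinct = 50, Minimum = 1 and Maximum = 50 but is not stated in the report.

```python
import math

def discrete_uniform_stats(n):
    """Mean, standard deviation and excess kurtosis of a uniform draw from 1..n."""
    mean = (n + 1) / 2
    std = math.sqrt((n * n - 1) / 12)
    excess_kurtosis = -6 * (n * n + 1) / (5 * (n * n - 1))
    return mean, std, excess_kurtosis

mean, std, kurt = discrete_uniform_stats(50)
print(f"expected mean {mean:.2f}, std {std:.2f}, kurtosis {kurt:.2f}")
# Reported for Agent ID: mean 25.47, std 14.45, kurtosis -1.20
```

The reported values sit close to the theoretical ones, so reading Agent ID as roughly uniform appears reasonable. The same comparison with n = 2000 could be applied to Employee ID.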
[Figure: Histogram with fixed size bins (bins=50)]
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 48 | 2027 | 2.1% |
| 39 | 2026 | 2.1% |
| 3 | 2021 | 2.1% |
| 35 | 2007 | 2.1% |
| 24 | 2003 | 2.1% |
| 5 | 2000 | 2.1% |
| 15 | 1991 | 2.0% |
| 4 | 1988 | 2.0% |
| 31 | 1987 | 2.0% |
| 19 | 1984 | 2.0% |
| Other values (40) | 77464 | 79.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1969 | 2.0% |
| 2 | 1968 | 2.0% |
| 3 | 2021 | 2.1% |
| 4 | 1988 | 2.0% |
| 5 | 2000 | 2.1% |
| 6 | 1949 | 2.0% |
| 7 | 1935 | 2.0% |
| 8 | 1960 | 2.0% |
| 9 | 1949 | 2.0% |
| 10 | 1974 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 1949 | 2.0% |
| 49 | 1890 | 1.9% |
| 48 | 2027 | 2.1% |
| 47 | 1933 | 2.0% |
| 46 | 1950 | 2.0% |
| 45 | 1929 | 2.0% |
| 44 | 1943 | 2.0% |
| 43 | 1897 | 1.9% |
| 42 | 1945 | 2.0% |
| 41 | 1966 | 2.0% |
Request Category
Categorical
| Statistic | Value |
|---|---|
| Distinct | 4 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| System | 39002 |
| Login Access | 29193 |
| Software | 19570 |
| Hardware | 9733 |
Length
| Statistic | Value |
|---|---|
| Max length | 12 |
| Median length | 8 |
| Mean length | 8.3976287 |
| Min length | 6 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 818752 |
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Sample
| Row | Value |
|---|---|
| 1st row | Login Access |
| 2nd row | Login Access |
| 3rd row | System |
| 4th row | System |
| 5th row | Software |
Common Values
| Value | Count | Frequency (%) |
| System | 39002 | 40.0% |
| Login Access | 29193 | 29.9% |
| Software | 19570 | 20.1% |
| Hardware | 9733 | 10.0% |
[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
| Value | Count | Frequency (%) |
| system | 39002 | 30.8% |
| login | 29193 | 23.0% |
| access | 29193 | 23.0% |
| software | 19570 | 15.4% |
| hardware | 9733 | 7.7% |
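The lowercase table above appears to count whitespace-separated words rather than whole category labels, which is why "Login Access" shows up as two rows and the percentages differ from the Common Values table. A minimal sketch that reproduces those numbers from the category counts, under that assumption:

```python
from collections import Counter

# Category counts from the Common Values table.
category_counts = {
    "System": 39002,
    "Login Access": 29193,
    "Software": 19570,
    "Hardware": 9733,
}

# Split each label on whitespace, lowercase it, and weight by the row count.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:10s} {count:6d} {100 * count / total_words:5.1f}%")
```

The denominator is the total number of words (126691), not the number of rows, so "system" drops from 40.0% of rows to 30.8% of words.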
Most occurring characters
| Value | Count | Frequency (%) |
| e | 97498 | 11.9% |
| s | 97388 | 11.9% |
| S | 58572 | 7.2% |
| t | 58572 | 7.2% |
| c | 58386 | 7.1% |
| o | 48763 | 6.0% |
| r | 39036 | 4.8% |
| a | 39036 | 4.8% |
| m | 39002 | 4.8% |
| y | 39002 | 4.8% |
| Other values (10) | 243497 | 29.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 662868 | 81.0% |
| Uppercase Letter | 126691 | 15.5% |
| Space Separator | 29193 | 3.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 97498 | 14.7% |
| s | 97388 | 14.7% |
| t | 58572 | 8.8% |
| c | 58386 | 8.8% |
| o | 48763 | 7.4% |
| r | 39036 | 5.9% |
| a | 39036 | 5.9% |
| m | 39002 | 5.9% |
| y | 39002 | 5.9% |
| w | 29303 | 4.4% |
| Other values (5) | 116882 | 17.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 58572 | 46.2% |
| A | 29193 | 23.0% |
| L | 29193 | 23.0% |
| H | 9733 | 7.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 29193 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 789559 | 96.4% |
| Common | 29193 | 3.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 97498 | 12.3% |
| s | 97388 | 12.3% |
| S | 58572 | 7.4% |
| t | 58572 | 7.4% |
| c | 58386 | 7.4% |
| o | 48763 | 6.2% |
| r | 39036 | 4.9% |
| a | 39036 | 4.9% |
| m | 39002 | 4.9% |
| y | 39002 | 4.9% |
| Other values (9) | 214304 | 27.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 29193 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 818752 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 97498 | 11.9% |
| s | 97388 | 11.9% |
| S | 58572 | 7.2% |
| t | 58572 | 7.2% |
| c | 58386 | 7.1% |
| o | 48763 | 6.0% |
| r | 39036 | 4.8% |
| a | 39036 | 4.8% |
| m | 39002 | 4.8% |
| y | 39002 | 4.8% |
| Other values (10) | 243497 | 29.7% |
Issue Type
Categorical
| Statistic | Value |
|---|---|
| Distinct | 2 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| IT Request | 73220 |
| IT Error | 24278 |
Length
| Statistic | Value |
|---|---|
| Max length | 10 |
| Median length | 10 |
| Mean length | 9.5019795 |
| Min length | 8 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 926424 |
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Sample
| Row | Value |
|---|---|
| 1st row | IT Error |
| 2nd row | IT Error |
| 3rd row | IT Error |
| 4th row | IT Request |
| 5th row | IT Error |
Common Values
| Value | Count | Frequency (%) |
| IT Request | 73220 | 75.1% |
| IT Error | 24278 | 24.9% |
[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
| Value | Count | Frequency (%) |
| it | 97498 | 50.0% |
| request | 73220 | 37.5% |
| error | 24278 | 12.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 146440 | 15.8% |
| I | 97498 | 10.5% |
| T | 97498 | 10.5% |
| (space) | 97498 | 10.5% |
| R | 73220 | 7.9% |
| q | 73220 | 7.9% |
| u | 73220 | 7.9% |
| s | 73220 | 7.9% |
| t | 73220 | 7.9% |
| r | 72834 | 7.9% |
| Other values (2) | 48556 | 5.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 536432 | 57.9% |
| Uppercase Letter | 292494 | 31.6% |
| Space Separator | 97498 | 10.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 146440 | 27.3% |
| q | 73220 | 13.6% |
| u | 73220 | 13.6% |
| s | 73220 | 13.6% |
| t | 73220 | 13.6% |
| r | 72834 | 13.6% |
| o | 24278 | 4.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 97498 | 33.3% |
| T | 97498 | 33.3% |
| R | 73220 | 25.0% |
| E | 24278 | 8.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 97498 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 828926 | 89.5% |
| Common | 97498 | 10.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 146440 | 17.7% |
| I | 97498 | 11.8% |
| T | 97498 | 11.8% |
| R | 73220 | 8.8% |
| q | 73220 | 8.8% |
| u | 73220 | 8.8% |
| s | 73220 | 8.8% |
| t | 73220 | 8.8% |
| r | 72834 | 8.8% |
| E | 24278 | 2.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 97498 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 926424 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 146440 | 15.8% |
| I | 97498 | 10.5% |
| T | 97498 | 10.5% |
| (space) | 97498 | 10.5% |
| R | 73220 | 7.9% |
| q | 73220 | 7.9% |
| u | 73220 | 7.9% |
| s | 73220 | 7.9% |
| t | 73220 | 7.9% |
| r | 72834 | 7.9% |
| Other values (2) | 48556 | 5.2% |
Severity
Categorical
| Statistic | Value |
|---|---|
| Distinct | 5 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| 2 - Normal | 88656 |
| 3 - Mayor | 4836 |
| 1 - Minor | 2258 |
| 4 - Urgent | 1392 |
| 0 - Unclasified | 356 |
Length
| Statistic | Value |
|---|---|
| Max length | 15 |
| Median length | 10 |
| Mean length | 9.9454963 |
| Min length | 9 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 969666 |
| Distinct characters | 25 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Sample
| Row | Value |
|---|---|
| 1st row | 0 - Unclasified |
| 2nd row | 0 - Unclasified |
| 3rd row | 0 - Unclasified |
| 4th row | 0 - Unclasified |
| 5th row | 2 - Normal |
Common Values
| Value | Count | Frequency (%) |
| 2 - Normal | 88656 | 90.9% |
| 3 - Mayor | 4836 | 5.0% |
| 1 - Minor | 2258 | 2.3% |
| 4 - Urgent | 1392 | 1.4% |
| 0 - Unclasified | 356 | 0.4% |
[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
| Value | Count | Frequency (%) |
| (empty) | 97498 | 33.3% |
| 2 | 88656 | 30.3% |
| normal | 88656 | 30.3% |
| 3 | 4836 | 1.7% |
| mayor | 4836 | 1.7% |
| 1 | 2258 | 0.8% |
| minor | 2258 | 0.8% |
| 4 | 1392 | 0.5% |
| urgent | 1392 | 0.5% |
| 0 | 356 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 194996 | 20.1% |
| - | 97498 | 10.1% |
| r | 97142 | 10.0% |
| o | 95750 | 9.9% |
| a | 93848 | 9.7% |
| l | 89012 | 9.2% |
| 2 | 88656 | 9.1% |
| N | 88656 | 9.1% |
| m | 88656 | 9.1% |
| M | 7094 | 0.7% |
| Other values (15) | 28358 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 482176 | 49.7% |
| Space Separator | 194996 | 20.1% |
| Dash Punctuation | 97498 | 10.1% |
| Decimal Number | 97498 | 10.1% |
| Uppercase Letter | 97498 | 10.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 97142 | 20.1% |
| o | 95750 | 19.9% |
| a | 93848 | 19.5% |
| l | 89012 | 18.5% |
| m | 88656 | 18.4% |
| y | 4836 | 1.0% |
| n | 4006 | 0.8% |
| i | 2970 | 0.6% |
| e | 1748 | 0.4% |
| g | 1392 | 0.3% |
| Other values (5) | 2816 | 0.6% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 88656 | 90.9% |
| 3 | 4836 | 5.0% |
| 1 | 2258 | 2.3% |
| 4 | 1392 | 1.4% |
| 0 | 356 | 0.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 88656 | 90.9% |
| M | 7094 | 7.3% |
| U | 1748 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 194996 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 97498 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 579674 | 59.8% |
| Common | 389992 | 40.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 97142 | 16.8% |
| o | 95750 | 16.5% |
| a | 93848 | 16.2% |
| l | 89012 | 15.4% |
| N | 88656 | 15.3% |
| m | 88656 | 15.3% |
| M | 7094 | 1.2% |
| y | 4836 | 0.8% |
| n | 4006 | 0.7% |
| i | 2970 | 0.5% |
| Other values (8) | 7704 | 1.3% |
Common
| Value | Count | Frequency (%) |
| (space) | 194996 | 50.0% |
| - | 97498 | 25.0% |
| 2 | 88656 | 22.7% |
| 3 | 4836 | 1.2% |
| 1 | 2258 | 0.6% |
| 4 | 1392 | 0.4% |
| 0 | 356 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 969666 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 194996 | 20.1% |
| - | 97498 | 10.1% |
| r | 97142 | 10.0% |
| o | 95750 | 9.9% |
| a | 93848 | 9.7% |
| l | 89012 | 9.2% |
| 2 | 88656 | 9.1% |
| N | 88656 | 9.1% |
| m | 88656 | 9.1% |
| M | 7094 | 0.7% |
| Other values (15) | 28358 | 2.9% |
Priority
Categorical
| Statistic | Value |
|---|---|
| Distinct | 4 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| 3 - High | 35549 |
| 0 - Unassiged | 29410 |
| 1 - Low | 16694 |
| 2 - Mid | 15845 |
Length
| Statistic | Value |
|---|---|
| Max length | 13 |
| Median length | 8 |
| Mean length | 9.1744959 |
| Min length | 7 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 894495 |
| Distinct characters | 20 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Sample
| Row | Value |
|---|---|
| 1st row | 0 - Unassiged |
| 2nd row | 0 - Unassiged |
| 3rd row | 0 - Unassiged |
| 4th row | 0 - Unassiged |
| 5th row | 0 - Unassiged |
Common Values
| Value | Count | Frequency (%) |
| 3 - High | 35549 | 36.5% |
| 0 - Unassiged | 29410 | 30.2% |
| 1 - Low | 16694 | 17.1% |
| 2 - Mid | 15845 | 16.3% |
[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
| Value | Count | Frequency (%) |
| (empty) | 97498 | 33.3% |
| 3 | 35549 | 12.2% |
| high | 35549 | 12.2% |
| 0 | 29410 | 10.1% |
| unassiged | 29410 | 10.1% |
| 1 | 16694 | 5.7% |
| low | 16694 | 5.7% |
| 2 | 15845 | 5.4% |
| mid | 15845 | 5.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 194996 | 21.8% |
| - | 97498 | 10.9% |
| i | 80804 | 9.0% |
| g | 64959 | 7.3% |
| s | 58820 | 6.6% |
| d | 45255 | 5.1% |
| 3 | 35549 | 4.0% |
| H | 35549 | 4.0% |
| h | 35549 | 4.0% |
| e | 29410 | 3.3% |
| Other values (10) | 216106 | 24.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 407005 | 45.5% |
| Space Separator | 194996 | 21.8% |
| Dash Punctuation | 97498 | 10.9% |
| Decimal Number | 97498 | 10.9% |
| Uppercase Letter | 97498 | 10.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 80804 | 19.9% |
| g | 64959 | 16.0% |
| s | 58820 | 14.5% |
| d | 45255 | 11.1% |
| h | 35549 | 8.7% |
| e | 29410 | 7.2% |
| a | 29410 | 7.2% |
| n | 29410 | 7.2% |
| o | 16694 | 4.1% |
| w | 16694 | 4.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 35549 | 36.5% |
| 0 | 29410 | 30.2% |
| 1 | 16694 | 17.1% |
| 2 | 15845 | 16.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 35549 | 36.5% |
| U | 29410 | 30.2% |
| L | 16694 | 17.1% |
| M | 15845 | 16.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 194996 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 97498 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 504503 | 56.4% |
| Common | 389992 | 43.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 80804 | 16.0% |
| g | 64959 | 12.9% |
| s | 58820 | 11.7% |
| d | 45255 | 9.0% |
| H | 35549 | 7.0% |
| h | 35549 | 7.0% |
| e | 29410 | 5.8% |
| a | 29410 | 5.8% |
| n | 29410 | 5.8% |
| U | 29410 | 5.8% |
| Other values (4) | 65927 | 13.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 194996 | 50.0% |
| - | 97498 | 25.0% |
| 3 | 35549 | 9.1% |
| 0 | 29410 | 7.5% |
| 1 | 16694 | 4.3% |
| 2 | 15845 | 4.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 894495 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 194996 | 21.8% |
| - | 97498 | 10.9% |
| i | 80804 | 9.0% |
| g | 64959 | 7.3% |
| s | 58820 | 6.6% |
| d | 45255 | 5.1% |
| 3 | 35549 | 4.0% |
| H | 35549 | 4.0% |
| h | 35549 | 4.0% |
| e | 29410 | 3.3% |
| Other values (10) | 216106 | 24.2% |
Resolution Time (Days)
Numeric
| Statistic | Value |
|---|---|
| Distinct | 22 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.5531498 |
| Minimum | 0 |
| Maximum | 21 |
| Zeros | 25071 |
| Zeros (%) | 25.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 761.8 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | 0 |
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 4 |
| Q3 | 7 |
| 95-th percentile | 14 |
| Maximum | 21 |
| Range | 21 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 4.3655179 |
| Coefficient of variation (CV) | 0.95879074 |
| Kurtosis | 0.018514197 |
| Mean | 4.5531498 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.85081825 |
| Sum | 443923 |
| Variance | 19.057746 |
| Monotonicity | Not monotonic |
[Figure: Histogram with fixed size bins (bins=22)]
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 25071 | 25.7% |
| 1 | 9277 | 9.5% |
| 5 | 8789 | 9.0% |
| 6 | 7802 | 8.0% |
| 7 | 6582 | 6.8% |
| 2 | 6466 | 6.6% |
| 3 | 6200 | 6.4% |
| 4 | 4919 | 5.0% |
| 8 | 4850 | 5.0% |
| 10 | 3899 | 4.0% |
| Other values (12) | 13643 | 14.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 25071 | 25.7% |
| 1 | 9277 | 9.5% |
| 2 | 6466 | 6.6% |
| 3 | 6200 | 6.4% |
| 4 | 4919 | 5.0% |
| 5 | 8789 | 9.0% |
| 6 | 7802 | 8.0% |
| 7 | 6582 | 6.8% |
| 8 | 4850 | 5.0% |
| 9 | 3739 | 3.8% |
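The counts for the smallest values are enough to rebuild the lower part of the cumulative distribution, which is a useful way to connect this table to the quantile statistics. A minimal sketch using only the counts listed above:

```python
from itertools import accumulate

n_rows = 97498
# Counts for 0..9 days, copied from the table of smallest values.
counts = [25071, 9277, 6466, 6200, 4919, 8789, 7802, 6582, 4850, 3739]

cumulative = list(accumulate(counts))
for days, running in enumerate(cumulative):
    print(f"<= {days} days: {running / n_rows:.1%}")
```

The cumulative share first passes 25% at 0 days, 50% at 4 days and 75% at 7 days, which agrees with the reported Q1, median and Q3.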
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 2 | < 0.1% |
| 20 | 2 | < 0.1% |
| 19 | 130 | 0.1% |
| 18 | 124 | 0.1% |
| 17 | 554 | 0.6% |
| 16 | 1167 | 1.2% |
| 15 | 1360 | 1.4% |
| 14 | 1566 | 1.6% |
| 13 | 1712 | 1.8% |
| 12 | 1555 | 1.6% |
Satisfaction Rate
Categorical
| Statistic | Value |
|---|---|
| Distinct | 5 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 761.8 KiB |

| Value | Count |
|---|---|
| 5 | 50770 |
| 4 | 27562 |
| 1 | 9907 |
| 3 | 7282 |
| 2 | 1977 |
Length
| Statistic | Value |
|---|---|
| Max length | 1 |
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 97498 |
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Sample
| Row | Value |
|---|---|
| 1st row | 5 |
| 2nd row | 5 |
| 3rd row | 5 |
| 4th row | 5 |
| 5th row | 5 |
Common Values
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
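Because Satisfaction Rate is profiled as a categorical variable, the report gives no mean or quantiles for it. If the values are treated as ordinal ratings from 1 to 5 with 5 as the best score (an assumption; the report does not define the scale), summary figures can be derived from the counts alone:

```python
# Counts per rating, copied from the Common Values table.
rating_counts = {5: 50770, 4: 27562, 1: 9907, 3: 7282, 2: 1977}

n = sum(rating_counts.values())
mean_rating = sum(rating * count for rating, count in rating_counts.items()) / n
share_4_or_5 = (rating_counts[4] + rating_counts[5]) / n

print(f"n = {n}, mean rating = {mean_rating:.2f}, rated 4 or 5 = {share_4_or_5:.1%}")
```

The counts add up to the 97498 observations, which also confirms that the table lists every category.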
[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 97498 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 97498 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 97498 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 5 | 50770 | 52.1% |
| 4 | 27562 | 28.3% |
| 1 | 9907 | 10.2% |
| 3 | 7282 | 7.5% |
| 2 | 1977 | 2.0% |
Correlations
| | Employee ID | Agent ID | Resolution Time (Days) | Request Category | Issue Type | Severity | Priority | Satisfaction Rate |
|---|---|---|---|---|---|---|---|---|
| Employee ID | 1.000 | -0.000 | -0.006 | 0.007 | 0.004 | 0.005 | 0.025 | 0.004 |
| Agent ID | -0.000 | 1.000 | -0.008 | 0.000 | 0.006 | 0.002 | 0.000 | 0.065 |
| Resolution Time (Days) | -0.006 | -0.008 | 1.000 | 0.474 | 0.228 | 0.060 | 0.199 | 0.024 |
| Request Category | 0.007 | 0.000 | 0.474 | 1.000 | 0.001 | 0.000 | 0.004 | 0.000 |
| Issue Type | 0.004 | 0.006 | 0.228 | 0.001 | 1.000 | 0.133 | 0.000 | 0.004 |
| Severity | 0.005 | 0.002 | 0.060 | 0.000 | 0.133 | 1.000 | 0.030 | 0.013 |
| Priority | 0.025 | 0.000 | 0.199 | 0.004 | 0.000 | 0.030 | 1.000 | 0.007 |
| Satisfaction Rate | 0.004 | 0.065 | 0.024 | 0.000 | 0.004 | 0.013 | 0.007 | 1.000 |
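Most entries in this matrix are close to zero, so it is easier to read once the few larger off-diagonal values are pulled out. The sketch below lists variable pairs whose absolute coefficient is at least 0.1; the threshold is a hypothetical cut-off chosen only for illustration, and the report excerpt does not say which coefficient each cell uses.

```python
# Values copied from the correlation matrix above.
names = [
    "Employee ID", "Agent ID", "Resolution Time (Days)", "Request Category",
    "Issue Type", "Severity", "Priority", "Satisfaction Rate",
]
matrix = [
    [1.000, -0.000, -0.006, 0.007, 0.004, 0.005, 0.025, 0.004],
    [-0.000, 1.000, -0.008, 0.000, 0.006, 0.002, 0.000, 0.065],
    [-0.006, -0.008, 1.000, 0.474, 0.228, 0.060, 0.199, 0.024],
    [0.007, 0.000, 0.474, 1.000, 0.001, 0.000, 0.004, 0.000],
    [0.004, 0.006, 0.228, 0.001, 1.000, 0.133, 0.000, 0.004],
    [0.005, 0.002, 0.060, 0.000, 0.133, 1.000, 0.030, 0.013],
    [0.025, 0.000, 0.199, 0.004, 0.000, 0.030, 1.000, 0.007],
    [0.004, 0.065, 0.024, 0.000, 0.004, 0.013, 0.007, 1.000],
]

threshold = 0.1  # hypothetical cut-off, for illustration only

# Walk the upper triangle so each pair is reported once.
pairs = [
    (abs(matrix[i][j]), names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= threshold
]
for value, a, b in sorted(pairs, reverse=True):
    print(f"{value:.3f}  {a} / {b}")
```

With this cut-off, three of the four pairs involve Resolution Time (Days), with Request Category the strongest at 0.474.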
[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
First rows
| | ID Ticket | Fecha | Employee ID | Agent ID | Request Category | Issue Type | Severity | Priority | Resolution Time (Days) | Satisfaction Rate |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GDDENR-5042564453 | 2016-07-13 | 1735 | 4 | Login Access | IT Error | 0 - Unclasified | 0 - Unassiged | 0 | 5 |
| 1 | GDDENR-8042508060 | 2016-05-18 | 1566 | 10 | Login Access | IT Error | 0 - Unclasified | 0 - Unassiged | 0 | 5 |
| 2 | GDDESR-1342539995 | 2016-06-18 | 569 | 29 | System | IT Error | 0 - Unclasified | 0 - Unassiged | 3 | 5 |
| 3 | GDDTSR-5942488006 | 2016-04-28 | 320 | 40 | System | IT Request | 0 - Unclasified | 0 - Unassiged | 9 | 5 |
| 4 | GDLEER-0042524120 | 2016-06-03 | 1842 | 31 | Software | IT Error | 2 - Normal | 0 - Unassiged | 0 | 5 |
| 5 | GDLEER-0142608095 | 2016-08-26 | 59 | 20 | Software | IT Error | 2 - Normal | 0 - Unassiged | 1 | 1 |
| 6 | GDLEER-0242564650 | 2016-07-13 | 1175 | 36 | Software | IT Error | 2 - Normal | 0 - Unassiged | 2 | 1 |
| 7 | GDLEER-0542574815 | 2016-07-23 | 561 | 18 | Software | IT Error | 2 - Normal | 0 - Unassiged | 5 | 5 |
| 8 | GDLEER-0842457219 | 2016-03-28 | 71 | 12 | Software | IT Error | 2 - Normal | 0 - Unassiged | 8 | 5 |
| 9 | GDLEER-1242542213 | 2016-06-21 | 1831 | 42 | Software | IT Error | 2 - Normal | 0 - Unassiged | 2 | 5 |
Last rows
| | ID Ticket | Fecha | Employee ID | Agent ID | Request Category | Issue Type | Severity | Priority | Resolution Time (Days) | Satisfaction Rate |
|---|---|---|---|---|---|---|---|---|---|---|
| 97488 | TWRTSR-3543959748 | 2020-05-08 | 1414 | 47 | System | IT Request | 1 - Minor | 1 - Low | 5 | 4 |
| 97489 | TWRTSR-4544164065 | 2020-11-29 | 286 | 20 | System | IT Request | 1 - Minor | 1 - Low | 5 | 5 |
| 97490 | TWRTSR-4944035850 | 2020-07-23 | 935 | 18 | System | IT Request | 1 - Minor | 1 - Low | 9 | 5 |
| 97491 | TWRTSR-6344049420 | 2020-08-06 | 1302 | 4 | System | IT Request | 3 - Mayor | 1 - Low | 1 | 4 |
| 97492 | TWRTSR-7943973415 | 2020-05-22 | 491 | 4 | System | IT Request | 1 - Minor | 1 - Low | 19 | 4 |
| 97493 | TWRTSR-8543883120 | 2020-02-22 | 1142 | 1 | System | IT Request | 3 - Mayor | 1 - Low | 11 | 5 |
| 97494 | TWRTSR-8744097039 | 2020-09-23 | 223 | 40 | System | IT Request | 1 - Minor | 1 - Low | 7 | 4 |
| 97495 | TWRTSR-9643846768 | 2020-01-16 | 256 | 7 | System | IT Request | 3 - Mayor | 1 - Low | 13 | 5 |
| 97496 | TWRTSR-9944138906 | 2020-11-03 | 1060 | 9 | System | IT Request | 1 - Minor | 1 - Low | 9 | 5 |
| 97497 | TWRTST-8643986162 | 2020-06-04 | 1876 | 41 | System | IT Request | 1 - Minor | 1 - Low | 6 | 4 |