diff --git a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md b/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md index d49ecd847..24746482b 100644 --- a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md +++ b/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md @@ -3054,259 +3054,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### 4.8 MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### 4.9 MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. Frequency Domain Analysis ### 5.1 Conv @@ -4823,192 +4570,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5263,127 +4824,3 @@ Output Series: |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git 
a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md index d317569c2..87e18e973 100644 --- a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md @@ -3054,259 +3054,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### 4.8 MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### 4.9 MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. Frequency Domain Analysis ### 5.1 Conv @@ -4882,192 +4629,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5323,126 +4884,3 @@ Output Series: +-----------------------------+---------------------------+ ``` -### 9.2 Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ 
-``` - -### 9.3 RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md b/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md index 
722a30b52..8f467de61 100644 --- a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md +++ b/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md @@ -3057,259 +3057,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## Frequency Domain Analysis ### Conv @@ -4824,192 +4571,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
Series Discovery ### ConsecutiveSequences @@ -5265,126 +4826,3 @@ Output Series: +-----------------------------+---------------------------+ ``` -### Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 
RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md index 5289bde10..8eca6741f 100644 
--- a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md @@ -3056,259 +3056,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## Frequency Domain Analysis ### Conv @@ -4883,192 +4630,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
Series Discovery ### ConsecutiveSequences @@ -5324,126 +4885,4 @@ Output Series: +-----------------------------+---------------------------+ ``` -### Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 
RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` diff --git a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md b/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md index 8985f53e9..0e7598612 100644 --- 
a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md +++ b/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md @@ -3057,259 +3057,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## Frequency Domain Analysis ### Conv @@ -4825,192 +4572,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
Series Discovery ### ConsecutiveSequences @@ -5266,126 +4827,3 @@ Output Series: +-----------------------------+---------------------------+ ``` -### Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 
RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md index 5289bde10..abe6fb262 100644 
--- a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md @@ -3056,259 +3056,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## Frequency Domain Analysis ### Conv @@ -4883,192 +4630,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
Series Discovery ### ConsecutiveSequences @@ -5323,127 +4884,3 @@ Output Series: |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md 
b/src/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md index d49ecd847..24746482b 100644 --- a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md +++ b/src/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md @@ -3054,259 +3054,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### 4.8 MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### 4.9 MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. -+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule. -+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. Frequency Domain Analysis ### 5.1 Conv @@ -4823,192 +4570,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5263,127 +4824,3 @@ Output Series: |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git 
a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md index d317569c2..87e18e973 100644 --- a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md @@ -3054,259 +3054,6 @@ Output series: +-----------------------------+--------------------------------------------------------+ ``` - -### 4.8 MasterTrain - -#### Usage - -This function is used to train the VAR model based on master data. The model is trained on learning samples consisting of p+1 consecutive non-error points. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule. - -**Output Series:** Output a single series. The type is the same as the input. - -**Installation** -- Install IoTDB from branch `research/master-detector`. -- Run `mvn spotless:apply`. -- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`. -- Start IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in client. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| 
-|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ -``` - -### 4.9 MasterDetect - -#### Usage - -This function is used to detect time series and repair errors based on master data. The VAR model is trained by MasterTrain. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of neighbors in master data. It is a positive integer. 
By default, it will be estimated according to the tuple distance of the k-th nearest neighbor in the master data.
-+ `eta`: The distance threshold. By default, it will be estimated based on the 3-sigma rule.
-+ `eta`: The detection threshold. By default, it will be estimated based on the 3-sigma rule.
-+ `output_type`: The type of output. 'repair' for repairing and 'anomaly' for anomaly detection.
-+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column.
-
-**Output Series:** Output a single series. The type is the same as the input.
-
-**Installation**
-- Install IoTDB from branch `research/master-detector`.
-- Run `mvn spotless:apply`.
-- Run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies`.
-- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/`.
-- Start IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in client.
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| 
-|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repairing - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| 
-|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| true| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| 
-|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. Frequency Domain Analysis ### 5.1 Conv @@ -4882,192 +4629,6 @@ Output series: +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function is used to clean time series with master data. - -**Name**: MasterRepair -**Input Series:** Support multiple input series. The types are are in INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `omega`: The window size. It is a non-negative integer whose unit is millisecond. By default, it will be estimated according to the distances of two tuples with various time differences. -+ `eta`: The distance threshold. It is a positive number. By default, it will be estimated according to the distance distribution of tuples in windows. -+ `k`: The number of neighbors in master data. It is a positive integer. By default, it will be estimated according to the tuple dis- tance of the k-th nearest neighbor in the master data. -+ `output_column`: The repaired column to output, defaults to 1 which means output the repair result of the first column. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 
8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5323,126 +4884,3 @@ Output Series: +-----------------------------+---------------------------+ ``` -### 9.2 Representation - -#### Usage - -This function is used to represent a time series. - -**Name:** Representation - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is INT32. The length is `tb*vb`. The timestamps starting from 0 only indicate the order. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ 
-``` - -### 9.3 RM - -#### Usage - -This function is used to calculate the matching score of two time series according to the representation. - -**Name:** RM - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of timestamp blocks. Its default value is 10. -- `vb`: The number of value blocks. Its default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the matching score. - -**Note:** - -- Parameters `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning Window Size and Dimension - -Input Series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md b/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md index 
d33ad35f7..b03ffe964 100644 --- a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md +++ b/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_apache.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test +-----------------------------+--------------------------------------------------------+ ``` -### 4.8 MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 
39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 
0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### 4.9 MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 
116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 
116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repair - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| 
-+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. 
Frequency Domain Analysis ### 5.1 Conv @@ -4927,191 +4671,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function repairs time series data based on master data. - -**Name:** MasterRepair - -**Input Series:** Support multiple input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `omega`: The window size of the algorithm, a non-negative integer in milliseconds. By default, it is estimated automatically from the distances between pairs of tuples at different time differences. -- `eta`: The distance threshold of the algorithm, a positive number. By default, it is estimated automatically from the distance distribution of the tuples in the window. -- `k`: The number of nearest neighbors in the master data, a positive integer. By default, it is estimated automatically from the tuple distances of the k nearest neighbors in the master data. -- `output_column`: The index of the column to output. By default, the repair result of the first column is output. - -**Output Series:** Output a single series. The type is the same as that of the corresponding input column, and the series is the repaired result of that column. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| 
Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5366,127 +4925,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### Usage - -This function computes a representation of a time series. - -**Name:** Representation - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of time blocks. The default is 10. -- `vb`: The number of value blocks. The default is 10. - -**Output Series:** Output a single series. The type is INT32 and the length is `tb*vb`. The timestamps start from 0 and only indicate order. - -**Note:** - -- `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning the Number of Time Blocks and Value Blocks - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| 
-+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### Usage - -This function computes the matching degree of two time series based on their representations. - -**Name:** RM - -**Input Series:** Only support two input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of time blocks. The default is 10. -- `vb`: The number of value blocks. The default is 10. - -**Output Series:** Output a single series. The type is DOUBLE and the length is `1`. The timestamps start from 0; the series has a single data point whose timestamp is 0 and whose value is the matching degree of the two time series. - -**Note:** - -- `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning the Number of Time Blocks and Value Blocks - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md index 
a1283b5b3..60be7891c 100644 --- a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test +-----------------------------+--------------------------------------------------------+ ``` -### 4.8 MasterTrain - -#### Usage - -This function trains a VAR prediction model based on master data. It uses the provided master data to decide whether each point in the time series is an error, trains the VAR model on samples of p+1 consecutive non-error points, and outputs the trained model parameters. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The threshold for identifying error values. By default, it is estimated automatically using the 3-sigma rule. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Installation:** - -- Download the `research/master-detector` branch from the IoTDB code repository. -- In the root directory, run `mvn spotless:apply`. -- In the root directory, run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` to build the project. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/` on the IoTDB server. -- Start the IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in the client. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 
116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ 
-|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### 4.9 MasterDetect - -#### Usage - -This function detects and repairs error values in time series based on master data. It uses the provided master data to decide whether each point in the time series is an error, predicts the series with the model trained by MasterTrain, and repairs error values using both the predicted values and the master data. - -**Name:** MasterDetect - -**Input Series:** Support multiple input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `k`: The number of nearest neighbors in the master data, a positive integer. By default, it is estimated automatically from the tuple distances of the k nearest neighbors in the master data. -+ `eta`: The threshold for identifying error values. By default, it is estimated automatically using the 3-sigma rule. -+ `beta`: The threshold for identifying anomalies. By default, it is estimated automatically using the 3-sigma rule. -+ `output_type`: The type of output, either 'repair' or 'anomaly', that is, the repair result or the anomaly detection result. The default is 'repair'. -+ `output_column`: The index of the column to output. The default is 1, which outputs the repair result of the first column. - -**Installation:** - -- Download the `research/master-detector` branch from the IoTDB code repository. -- In the root directory, run `mvn spotless:apply`. -- In the root directory, run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` to build the project. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/` on the IoTDB server. -- Start the IoTDB server and run `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'` in the client. - -**Output Series:** Output a single series. The type is the same as that of the corresponding input column, and the series is the repaired result of that column. - - -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| 
-|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| 
-|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### Repair - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| 
-+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### Anomaly Detection - -SQL for query: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. 
Frequency Domain Analysis ### 5.1 Conv @@ -4915,191 +4659,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### Usage - -This function repairs time series data based on master data. - -**Name:** MasterRepair - -**Input Series:** Support multiple input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `omega`: The window size of the algorithm, a non-negative integer in milliseconds. By default, it is estimated automatically from the distances between pairs of tuples at different time differences. -- `eta`: The distance threshold of the algorithm, a positive number. By default, it is estimated automatically from the distance distribution of the tuples in the window. -- `k`: The number of nearest neighbors in the master data, a positive integer. By default, it is estimated automatically from the tuple distances of the k nearest neighbors in the master data. -- `output_column`: The index of the column to output. By default, the repair result of the first column is output. - -**Output Series:** Output a single series. The type is the same as that of the corresponding input column, and the series is the repaired result of that column. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -SQL for query: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -Output series: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| 
Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 8. Series Discovery ### 8.1 ConsecutiveSequences @@ -5354,127 +4913,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### Usage - -This function computes a representation of a time series. - -**Name:** Representation - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of time blocks. The default is 10. -- `vb`: The number of value blocks. The default is 10. - -**Output Series:** Output a single series. The type is INT32 and the length is `tb*vb`. The timestamps start from 0 and only indicate order. - -**Note:** - -- `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning the Number of Time Blocks and Value Blocks - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| 
-+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### Usage - -This function computes the matching degree of two time series based on their representations. - -**Name:** RM - -**Input Series:** Only support two input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `tb`: The number of time blocks. The default is 10. -- `vb`: The number of value blocks. The default is 10. - -**Output Series:** Output a single series. The type is DOUBLE and the length is `1`. The timestamps start from 0; the series has a single data point whose timestamp is 0 and whose value is the matching degree of the two time series. - -**Note:** - -- `tb` and `vb` should be positive integers. - -#### Examples - -##### Assigning the Number of Time Blocks and Value Blocks - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md b/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md index bc7d45871..4f20952fd 
100644 --- a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md +++ b/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_apache.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test +-----------------------------+--------------------------------------------------------+ ``` -### MasterTrain - -#### Usage - -This function trains a VAR prediction model based on master data. It uses the provided master data to decide whether each point in the time series is an error, trains the VAR model on samples of p+1 consecutive non-error points, and outputs the trained model parameters. - -**Name:** MasterTrain - -**Input Series:** Support multiple input series. The types are INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `p`: The order of the model. -+ `eta`: The threshold for identifying error values. By default, it is estimated automatically using the 3-sigma rule. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Installation:** - -- Download the `research/master-detector` branch from the IoTDB code repository. -- In the root directory, run `mvn spotless:apply`. -- In the root directory, run `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` to build the project. -- Copy `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar` to `./ext/udf/` on the IoTDB server. -- Start the IoTDB server and run `create function MasterTrain as 'org.apache.iotdb.library.anomaly.UDTFMasterTrain'` in the client. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| 
-|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -SQL for query: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 
0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 
116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 
116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| 
-+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 频域分析 ### Conv @@ -4928,191 +4672,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### 函数简介 - 
-本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- `eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- `output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| 
-|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 序列发现 ### ConsecutiveSequences @@ -5367,127 +4926,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md index cdf8428e0..8c621861b 100644 --- a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test 
+-----------------------------+--------------------------------------------------------+ ``` -### MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 
39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| 
-|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 
39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| 
-+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select 
MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 频域分析 ### Conv @@ -4914,191 +4658,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### 函数简介 - -本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- 
`eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- `output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| 
-|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 序列发现 ### ConsecutiveSequences @@ -5353,127 +4912,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 
- -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md b/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md index 0db5452fa..563678ca7 100644 --- a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md +++ b/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_apache.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test +-----------------------------+--------------------------------------------------------+ ``` -### MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / 
FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| 
-|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| 
-+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 
39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select 
MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` 
-+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 频域分析 ### Conv @@ -4927,191 +4671,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### 函数简介 - -本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- `eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- 
`output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| 
-+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 序列发现 ### ConsecutiveSequences @@ -5277,26 +4836,6 @@ select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 +-----------------------------+--------------------------------------------------------------------+ ``` - ## 机器学习 @@ -5366,127 +4905,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md index cdf8428e0..ddb3d8d99 100644 --- a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test 
+-----------------------------+--------------------------------------------------------+ ``` -### MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 
39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| 
-|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 
39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| 
-+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select 
MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 频域分析 ### Conv @@ -4914,190 +4658,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### MasterRepair - -#### 函数简介 - -本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- 
`eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- `output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| 
-|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - ## 序列发现 @@ -5264,27 +4824,6 @@ select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 +-----------------------------+--------------------------------------------------------------------+ ``` - - ## 机器学习 ### AR @@ -5353,127 +4892,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| -+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| 
-+-----------------------------+-------------------------------------------------+ -``` - -### RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md b/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md index d33ad35f7..b03ffe964 100644 --- a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md +++ b/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_apache.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test 
+-----------------------------+--------------------------------------------------------+ ``` -### 4.8 MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 
39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| 
-|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### 4.9 MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 
39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| 
-+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| -+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select 
MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. 
频域分析 ### 5.1 Conv @@ -4927,191 +4671,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### 函数简介 - -本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- `eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- `output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| 
Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 8. 序列发现 ### 8.1 ConsecutiveSequences @@ -5366,127 +4925,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| 
-+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` - diff --git a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md index 
a1283b5b3..60be7891c 100644 --- a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md +++ b/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md @@ -3110,262 +3110,6 @@ select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test +-----------------------------+--------------------------------------------------------+ ``` -### 4.8 MasterTrain - -#### 函数简介 - -本函数基于主数据训练VAR预测模型。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由连续p+1个非错误值作为训练样本训练VAR模型,输出训练后的模型参数。 - -**函数名:** MasterTrain - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 - -**输出序列:** 输出单个序列,类型为DOUBLE。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterTrain as org.apache.iotdb.library.anomaly.UDTFMasterTrain'`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| -+-----------------------------+------------+------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 116.327305| 116.3272269| 39.99984748| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 
39.99985014| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 116.3277016| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| -+-----------------------------+------------+------------+--------------+--------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------------+ -| Time|MasterTrain(root.test.lo, root.test.la, root.test.m_lo, root.test.m_la, "p"="3", "eta"="1.0")| -+-----------------------------+---------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 
0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| -0.07863708480736269| -+-----------------------------+---------------------------------------------------------------------------------------------+ - -``` - -### 4.9 MasterDetect - -#### 函数简介 - -本函数基于主数据检测并修复时间序列中的错误值。将根据提供的主数据判断时间序列中的数据点是否为错误值,并由MasterTrain训练的模型进行时间序列预测,错误值将由预测值及主数据共同修复。 - -**函数名:** MasterDetect - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `p`:模型阶数。 -+ `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -+ `eta`:错误值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `beta`:异常值判定阈值,在缺省情况下,算法根据3-sigma原则自动估计该参数。 -+ `output_type`:输出结果类型,可选'repair'或'anomaly',即输出修复结果或异常检测结果,在缺省情况下默认为'repair'。 -+ `output_column`:输出列的序号,默认为1,即输出第一列的修复结果。 - -**安装方式:** - -- 从IoTDB代码仓库下载`research/master-detector`分支代码到本地 -- 在根目录运行 `mvn spotless:apply` -- 在根目录运行 `mvn clean package -pl library-udf -DskipTests -am -P get-jar-with-dependencies` 编译项目 -- 将 `./library-UDF/target/library-udf-1.2.0-SNAPSHOT-jar-with-dependencies.jar`复制到IoTDB服务器的`./ext/udf/` 路径下。 -- 启动 IoTDB服务器,在客户端中执行 `create function MasterDetect as 'org.apache.iotdb.library.anomaly.UDTFMasterDetect'`。 - -**输出序列:** 输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -| Time|root.test.lo|root.test.la|root.test.m_la|root.test.m_lo| root.test.model| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -|1970-01-01T08:00:00.001+08:00| 39.99982556| 116.327274| 116.3271939| 39.99984748| 0.13656607660463288| -|1970-01-01T08:00:00.002+08:00| 39.99983865| 
116.327305| 116.3272269| 39.99984748| 0.8291884323013894| -|1970-01-01T08:00:00.003+08:00| 40.00019038| 116.3273291| 116.3272634| 39.99984769| 0.05012816073171693| -|1970-01-01T08:00:00.004+08:00| 39.99982556| 116.327342| 116.3273015| 39.9998483| -0.5495287787485761| -|1970-01-01T08:00:00.005+08:00| 39.99982991| 116.3273744| 116.327339| 39.99984892| 0.03740486307345578| -|1970-01-01T08:00:00.006+08:00| 39.99982716| 116.3274117| 116.3273759| 39.99984892| 1.0500132150475212| -|1970-01-01T08:00:00.007+08:00| 39.9998259| 116.3274396| 116.3274163| 39.99984953| 0.04583944643116993| -|1970-01-01T08:00:00.008+08:00| 39.99982597| 116.3274668| 116.3274525| 39.99985014|-0.07863708480736269| -|1970-01-01T08:00:00.009+08:00| 39.99982226| 116.3275026| 116.3274915| 39.99985076| null| -|1970-01-01T08:00:00.010+08:00| 39.99980988| 116.3274967| 116.3275235| 39.99985137| null| -|1970-01-01T08:00:00.011+08:00| 39.99984873| 116.3274929| 116.3275611| 39.99985199| null| -|1970-01-01T08:00:00.012+08:00| 39.99981589| 116.3274745| 116.3275974| 39.9998526| null| -|1970-01-01T08:00:00.013+08:00| 39.9998259| 116.3275095| 116.3276338| 39.99985384| null| -|1970-01-01T08:00:00.014+08:00| 39.99984873| 116.3274787| 116.3276695| 39.99985446| null| -|1970-01-01T08:00:00.015+08:00| 39.9998343| 116.3274693| 116.3277045| 39.99985569| null| -|1970-01-01T08:00:00.016+08:00| 39.99983316| 116.3274941| 116.3277389| 39.99985631| null| -|1970-01-01T08:00:00.017+08:00| 39.99983311| 116.3275401| 116.3277747| 39.99985693| null| -|1970-01-01T08:00:00.018+08:00| 39.99984113| 116.3275713| 116.3278041| 39.99985756| null| -|1970-01-01T08:00:00.019+08:00| 39.99983602| 116.3276003| 116.3278379| 39.99985818| null| -|1970-01-01T08:00:00.020+08:00| 39.9998355| 116.3276308| 116.3278723| 39.9998588| null| -|1970-01-01T08:00:00.021+08:00| 40.00012176| 116.3276107| 116.3279026| 39.99985942| null| -|1970-01-01T08:00:00.022+08:00| 39.9998404| 116.3276684| null| null| null| -|1970-01-01T08:00:00.023+08:00| 39.99983942| 
116.3277016| null| null| null| -|1970-01-01T08:00:00.024+08:00| 39.99984113| 116.3277284| null| null| null| -|1970-01-01T08:00:00.025+08:00| 39.99984283| 116.3277562| null| null| null| -+-----------------------------+------------+------------+--------------+--------------+--------------------+ -``` - -##### 修复 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0')| -+-----------------------------+--------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 116.327274| -|1970-01-01T08:00:00.002+08:00| 116.327305| -|1970-01-01T08:00:00.003+08:00| 116.3273291| -|1970-01-01T08:00:00.004+08:00| 116.327342| -|1970-01-01T08:00:00.005+08:00| 116.3273744| -|1970-01-01T08:00:00.006+08:00| 116.3274117| -|1970-01-01T08:00:00.007+08:00| 116.3274396| -|1970-01-01T08:00:00.008+08:00| 116.3274668| -|1970-01-01T08:00:00.009+08:00| 116.3275026| -|1970-01-01T08:00:00.010+08:00| 116.3274967| -|1970-01-01T08:00:00.011+08:00| 116.3274929| -|1970-01-01T08:00:00.012+08:00| 116.3274745| -|1970-01-01T08:00:00.013+08:00| 116.3275095| -|1970-01-01T08:00:00.014+08:00| 116.3274787| -|1970-01-01T08:00:00.015+08:00| 116.3274693| -|1970-01-01T08:00:00.016+08:00| 116.3274941| -|1970-01-01T08:00:00.017+08:00| 116.3275401| -|1970-01-01T08:00:00.018+08:00| 116.3275713| -|1970-01-01T08:00:00.019+08:00| 116.3276003| -|1970-01-01T08:00:00.020+08:00| 116.3276308| -|1970-01-01T08:00:00.021+08:00| 116.3276338| -|1970-01-01T08:00:00.022+08:00| 116.3276684| -|1970-01-01T08:00:00.023+08:00| 116.3277016| -|1970-01-01T08:00:00.024+08:00| 116.3277284| -|1970-01-01T08:00:00.025+08:00| 116.3277562| 
-+-----------------------------+--------------------------------------------------------------------------------------+ -``` - -##### 异常检测 - -用于查询的 SQL 语句: - -```sql -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------------------------------+ -| Time|MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0')| -+-----------------------------+---------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| false| -|1970-01-01T08:00:00.002+08:00| false| -|1970-01-01T08:00:00.003+08:00| false| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| true| -|1970-01-01T08:00:00.006+08:00| false| -|1970-01-01T08:00:00.007+08:00| false| -|1970-01-01T08:00:00.008+08:00| false| -|1970-01-01T08:00:00.009+08:00| false| -|1970-01-01T08:00:00.010+08:00| false| -|1970-01-01T08:00:00.011+08:00| false| -|1970-01-01T08:00:00.012+08:00| false| -|1970-01-01T08:00:00.013+08:00| false| -|1970-01-01T08:00:00.014+08:00| true| -|1970-01-01T08:00:00.015+08:00| false| -|1970-01-01T08:00:00.016+08:00| false| -|1970-01-01T08:00:00.017+08:00| false| -|1970-01-01T08:00:00.018+08:00| false| -|1970-01-01T08:00:00.019+08:00| false| -|1970-01-01T08:00:00.020+08:00| false| -|1970-01-01T08:00:00.021+08:00| false| -|1970-01-01T08:00:00.022+08:00| false| -|1970-01-01T08:00:00.023+08:00| false| -|1970-01-01T08:00:00.024+08:00| false| -|1970-01-01T08:00:00.025+08:00| false| -+-----------------------------+---------------------------------------------------------------------------------------+ -``` - - - ## 5. 
频域分析 ### 5.1 Conv @@ -4915,191 +4659,6 @@ select valuerepair(s1,'method'='LsGreedy') from root.test.d2 +-----------------------------+-------------------------------------------------+ ``` -### 7.4 MasterRepair - -#### 函数简介 - -本函数实现基于主数据的时间序列数据修复。 - -**函数名:**MasterRepair - -**输入序列:** 支持多个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `omega`:算法窗口大小,非负整数(单位为毫秒), 在缺省情况下,算法根据不同时间差下的两个元组距离自动估计该参数。 -- `eta`:算法距离阈值,正数, 在缺省情况下,算法根据窗口中元组的距离分布自动估计该参数。 -- `k`:主数据中的近邻数量,正整数, 在缺省情况下,算法根据主数据中的k个近邻的元组距离自动估计该参数。 -- `output_column`:输出列的序号,默认输出第一列的修复结果。 - -**输出序列:**输出单个序列,类型与输入数据中对应列的类型相同,序列为输入列修复后的结果。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+------------+------------+------------+------------+------------+ -| Time|root.test.t1|root.test.t2|root.test.t3|root.test.m1|root.test.m2|root.test.m3| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -|2021-07-01T12:00:01.000+08:00| 1704| 1154.55| 0.195| 1704| 1154.55| 0.195| -|2021-07-01T12:00:02.000+08:00| 1702| 1152.30| 0.193| 1702| 1152.30| 0.193| -|2021-07-01T12:00:03.000+08:00| 1702| 1148.65| 0.192| 1702| 1148.65| 0.192| -|2021-07-01T12:00:04.000+08:00| 1701| 1145.20| 0.194| 1701| 1145.20| 0.194| -|2021-07-01T12:00:07.000+08:00| 1703| 1150.55| 0.195| 1703| 1150.55| 0.195| -|2021-07-01T12:00:08.000+08:00| 1694| 1151.55| 0.193| 1704| 1151.55| 0.193| -|2021-07-01T12:01:09.000+08:00| 1705| 1153.55| 0.194| 1705| 1153.55| 0.194| -|2021-07-01T12:01:10.000+08:00| 1706| 1152.30| 0.190| 1706| 1152.30| 0.190| -+-----------------------------+------------+------------+------------+------------+------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test -``` - -输出序列: - - -``` -+-----------------------------+-------------------------------------------------------------------------------------------+ -| 
Time|MasterRepair(root.test.t1,root.test.t2,root.test.t3,root.test.m1,root.test.m2,root.test.m3)| -+-----------------------------+-------------------------------------------------------------------------------------------+ -|2021-07-01T12:00:01.000+08:00| 1704| -|2021-07-01T12:00:02.000+08:00| 1702| -|2021-07-01T12:00:03.000+08:00| 1702| -|2021-07-01T12:00:04.000+08:00| 1701| -|2021-07-01T12:00:07.000+08:00| 1703| -|2021-07-01T12:00:08.000+08:00| 1704| -|2021-07-01T12:01:09.000+08:00| 1705| -|2021-07-01T12:01:10.000+08:00| 1706| -+-----------------------------+-------------------------------------------------------------------------------------------+ -``` - - - ## 8. 序列发现 ### 8.1 ConsecutiveSequences @@ -5354,127 +4913,3 @@ select ar(s0,"p"="2") from root.test.d0 |1970-01-01T08:00:00.002+08:00| -0.2571| +-----------------------------+---------------------------+ ``` - -### 9.2 Representation - -#### 函数简介 - -本函数用于时间序列的表示。 - -**函数名:** Representation - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为INT32,长度为`tb*vb`。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select representation(s0,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|representation(root.test.d0.s0,"tb"="3","vb"="2")| 
-+-----------------------------+-------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1| -|1970-01-01T08:00:00.002+08:00| 1| -|1970-01-01T08:00:00.003+08:00| 0| -|1970-01-01T08:00:00.004+08:00| 0| -|1970-01-01T08:00:00.005+08:00| 1| -|1970-01-01T08:00:00.006+08:00| 1| -+-----------------------------+-------------------------------------------------+ -``` - -### 9.3 RM - -#### 函数简介 - -本函数用于基于时间序列表示的匹配度。 - -**函数名:** RM - -**输入序列:** 仅支持两个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `tb`:时间分块数量。默认为10。 -- `vb`:值分块数量。默认为10。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度为`1`。序列的时间戳从0开始,序列仅有一个数据点,其时间戳为0,值为两个时间序列的匹配度。 - -**提示:** - -- `tb `,`vb`应为正整数。 - -#### 使用示例 - -##### 指定时间分块数量、值分块数量 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d0.s0|root.test.d0.s1 -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:03.000+08:00| -3.0| -3.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| 4.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------------------+ -| Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")| -+-----------------------------+-----------------------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 1.00| -+-----------------------------+-----------------------------------------------------+ -``` -