Shankar Jayaratnam committed on
Commit a2692bc · verified · 1 Parent(s): 4b93946

Update README.md

Files changed (1):
  1. README.md +81 -14
README.md CHANGED
@@ -13,23 +13,19 @@ pipeline_tag: zero-shot-classification
 # Model Card for Model ID

 <!-- Provide a quick summary of what the model is/does. -->

 ## Model Details

 ### Model Description

 <!-- Provide a longer summary of what this model is. -->
-
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-

 ### Model Sources [optional]

@@ -46,44 +42,89 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 ### Direct Use

 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

 [More Information Needed]

 ### Downstream Use [optional]

 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

 [More Information Needed]

 ### Out-of-Scope Use

 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

 [More Information Needed]

 ## Bias, Risks, and Limitations

 <!-- This section is meant to convey both technical and sociotechnical limitations. -->

 [More Information Needed]

 ### Recommendations

 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

 ## How to Get Started with the Model

- [More Information Needed]

 ## Training Details

 ### Training Data

 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

 [More Information Needed]

@@ -93,6 +134,20 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 #### Preprocessing [optional]

 [More Information Needed]

@@ -103,6 +158,9 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 #### Speeds, Sizes, Times [optional]

 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

 [More Information Needed]

@@ -115,12 +173,20 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 #### Testing Data

 <!-- This should link to a Dataset Card if possible. -->

 [More Information Needed]

 #### Factors

 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

 [More Information Needed]

@@ -164,17 +230,18 @@ Runs on both GPU A100 and Esperanto ET-SoC
 #### Software

- Use Pytorch

 ## Citation [optional]

 Esperanto Blog :

-
 ## Model Card Authors [optional]

 Sivakrishna Yaganti and Shankar Jayaratnam

 ## Model Card Contact

 [More Information Needed]

 # Model Card for Model ID

 <!-- Provide a quick summary of what the model is/does. -->
+ The Mistral 7B - Time Series Predictor is a fine-tuned large language model designed to analyze server performance metrics and forecast potential failures. It processes time-series data and predicts failure probabilities, offering actionable insights for predictive maintenance and operational risk assessment.

 ## Model Details

 ### Model Description

 <!-- Provide a longer summary of what this model is. -->
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

+ - **Developed by:** Sivakrishna Yaganti and Shankar Jayaratnam
+ - **Funded by:** Esperanto Technologies
+ - **Model type:** Causal Language Model, fine-tuned for time-series forecasting
+ - **Finetuned from model:** Mistral 7B

 ### Model Sources [optional]

 
 ### Direct Use

 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+ The model can be used directly to:
+
+ - Forecast server health based on time-series metrics such as temperature, power consumption, utilization, and throughput.
+ - Predict potential causes of failures using historical data.

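+ The direct-use workflow above implies serializing raw server metrics into a natural-language query. A minimal, hypothetical sketch of that step (the function name, metric phrasing, and prompt template are illustrative assumptions, not the exact format the model was trained on):
+
+ ```python
+ # Hypothetical sketch: turn a window of server metrics into a prompt
+ # for the predictor. Names and wording are assumptions for illustration.
+ def metrics_to_prompt(server: str, date: str, metrics: dict) -> str:
+     lines = [f"Server {server} metrics for {date}:"]
+     for name, values in metrics.items():
+         # Summarize each series by its first/last readings.
+         lines.append(
+             f"- {name}: moved from {values[0]} to {values[-1]} "
+             f"over {len(values)} readings"
+         )
+     lines.append("What is the failure probability and likely cause?")
+     return "\n".join(lines)
+
+ prompt = metrics_to_prompt(
+     "ET-1", "07/11/24",
+     {"temperature (°C)": [70, 78, 85], "power (W)": [180, 190, 210]},
+ )
+ print(prompt)
+ ```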
 
 ### Downstream Use [optional]

 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+ The model is well suited to integration into monitoring platforms such as Splunk and Grafana to:
+ - Monitor server health in real time.
+ - Support decision-making in preventive maintenance.

 
 ### Out-of-Scope Use

 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+ - This model is not designed for general time-series forecasting outside server health monitoring.
+ - It may not perform well on non-server-related data or domains significantly different from its training dataset.

 
 ## Bias, Risks, and Limitations

 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+ **Bias**
+ 1. Performance may vary on datasets whose metrics differ significantly from those in the training data.
+ 2. Predictions are most accurate within the context of server health monitoring.
+
+ **Risks**
+ 1. Relying solely on the model without validating its predictions may result in inaccurate failure forecasts.
+ 2. Model outputs are probabilistic and should be interpreted cautiously in critical systems.
+
+ **Limitations**
+ 1. Limited to time-series metrics related to server health (e.g., temperature, power, throughput).
+ 2. Performance may degrade on very sparse or noisy datasets.

 ### Recommendations

 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+ 1. Use the model in conjunction with other predictive maintenance tools.
+ 2. Validate model predictions against domain knowledge to ensure accuracy.

 ## How to Get Started with the Model

+ The Mistral 7B - Time Series Predictor can process time-series queries such as server health metrics and predict failure probabilities and causes. The following Python script demonstrates how to load the model and generate responses.
+
+ ### Code
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Esperanto/Mistral-7B-TimeSeriesReasoner"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+
+ prompt = "What is the failure probability and Cause for Server 'x' on Date : [mm/dd/yy]?"
+ input_ids = tokenizer(prompt, return_tensors='pt')['input_ids']
+ output = model.generate(input_ids=input_ids, max_new_tokens=100)
+ response = tokenizer.decode(output[0])
+ print(response)
+ ```
+
+ **Example Prompt**
+ - What is the failure probability and Cause for Server 'x' on Date : [mm/dd/yy]?
+ - *Expected Output*: The failure probability for ET-1 on 11th July is 0.72. The likely cause is overheating due to sustained high temperatures over the past week.
+
+ ### Requirements
+ #### Dependencies:
+ - pip install torch transformers

 ## Training Details

 ### Training Data

 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ **Source:** Synthetic and real-world server metrics from Esperanto servers.
+ **Dataset:** Synthetic data generated with periodic patterns (e.g., cosine functions) combined with operational zones (green, yellow, red).
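+ The synthetic recipe described above (periodic cosine patterns mapped to green/yellow/red operational zones) can be sketched as follows; the period, zone thresholds, and field names are illustrative assumptions, not the actual generation code:
+
+ ```python
+ import math
+
+ # Hypothetical sketch of the synthetic-data recipe: a cosine utilization
+ # signal bucketed into green/yellow/red zones. Thresholds are assumptions.
+ def synthetic_utilization(n_steps: int, period: int = 24) -> list:
+     samples = []
+     for t in range(n_steps):
+         # Cosine pattern oscillating between 0 and 100 percent utilization.
+         util = 50 + 50 * math.cos(2 * math.pi * t / period)
+         if util < 60:
+             zone = "green"
+         elif util < 85:
+             zone = "yellow"
+         else:
+             zone = "red"
+         samples.append({"t": t, "utilization": round(util, 1), "zone": zone})
+     return samples
+
+ data = synthetic_utilization(48)
+ print(data[0])  # peak utilization at t=0 falls in the red zone
+ ```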
 
 
 #### Preprocessing [optional]

+ ##### Numerical-to-Textual Conversion
+ All numerical metrics (e.g., temperature, power consumption, throughput) were converted into descriptive text to make them comprehensible to the language model. For example:
+
+ - Numerical input: {"temperature": [40, 42, 43]}
+ - Converted text: "The temperature increased steadily from 40°C to 43°C over the last three readings."
+
+ ##### Domain-Specific Context
+ Prompts were carefully designed to incorporate domain knowledge, guiding the model to focus on server health indicators and operational risks. Example prompts include:
+ 1. "Analyze the following server performance metrics and predict potential failures."
+ 2. "Based on the provided metrics, forecast failure probabilities and identify potential causes."
+
+ *These prompts ensured the model understood the critical relationships between input metrics and their operational implications.*
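+ The numerical-to-textual conversion step above can be sketched as a small helper; the trend wording is modeled on the example shown and is an assumption, not the exact preprocessing code:
+
+ ```python
+ # Hypothetical sketch of the numerical-to-textual preprocessing step;
+ # the sentence template is an assumption modeled on the README's example.
+ def describe_metric(name: str, unit: str, values: list) -> str:
+     if values[-1] > values[0]:
+         trend = "increased steadily"
+     elif values[-1] < values[0]:
+         trend = "decreased steadily"
+     else:
+         trend = "remained stable"
+     return (f"The {name} {trend} from {values[0]}{unit} to "
+             f"{values[-1]}{unit} over the last {len(values)} readings.")
+
+ print(describe_metric("temperature", "°C", [40, 42, 43]))
+ # → The temperature increased steadily from 40°C to 43°C over the last 3 readings.
+ ```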
 
 
 #### Speeds, Sizes, Times [optional]

 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+ - Training time: ~30 hours on NVIDIA A100 GPUs
+ - Model size: ~7B parameters

 
 #### Testing Data

 <!-- This should link to a Dataset Card if possible. -->
+ *Validation set:* 10% of the synthetic and real-world server performance data.

 #### Factors

 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+ The model was evaluated for:
+ - Failure prediction accuracy, including the predicted cause.
+
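+ The criterion above (prediction accuracy with cause) can be sketched as a simple scorer; the probability tolerance and label format are illustrative assumptions, not the published evaluation setup:
+
+ ```python
+ # Hypothetical sketch of scoring failure predictions against references.
+ # The 0.1 probability tolerance and dict fields are assumptions.
+ def evaluate(predictions: list, references: list, prob_tol: float = 0.1) -> dict:
+     prob_hits = cause_hits = 0
+     for pred, ref in zip(predictions, references):
+         if abs(pred["prob"] - ref["prob"]) <= prob_tol:
+             prob_hits += 1
+         if pred["cause"] == ref["cause"]:
+             cause_hits += 1
+     n = len(references)
+     return {"prob_accuracy": prob_hits / n, "cause_accuracy": cause_hits / n}
+
+ preds = [{"prob": 0.72, "cause": "overheating"}, {"prob": 0.10, "cause": "none"}]
+ refs  = [{"prob": 0.70, "cause": "overheating"}, {"prob": 0.40, "cause": "power"}]
+ print(evaluate(preds, refs))  # {'prob_accuracy': 0.5, 'cause_accuracy': 0.5}
+ ```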
+ ### Results
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6659207a17951b5bd11a91fa/UgK2hf8rK9gTw_1AAUuo7.png)

 
 
 #### Software

+ PyTorch and the Hugging Face Transformers library

 ## Citation [optional]

 Esperanto Blog :

 ## Model Card Authors [optional]

 Sivakrishna Yaganti and Shankar Jayaratnam

 ## Model Card Contact

 [More Information Needed]