
Log Analytics filtering

Zoltan.Gutleber
Builder

I have Log Analytics deployed via the agent machine using jobs, and I parse the log lines with a grok expression. However, I noticed that the database also receives data that clearly does not match, i.e. lines whose logLevel is not ERROR.
I don't want to parse those into columns, and because of capacity I don't want them in the database at all.

grok patterns:
- "%{TIMESTAMP_ISO8601:logEventTimestamp}%{SPACE}\\[%{NUMBER:logLevelId}\\]%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}-%{SPACE}%{GREEDYDATA:msg}"

pattern.grok:
LOGLEVEL ([Ee]rr?(?:or)?|ERR?(?:OR)?)
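For context, the grok match above can be approximated with plain regexes. This is only a sketch: the stand-in for TIMESTAMP_ISO8601 is simplified (real grok definitions are more permissive), but it shows that non-ERROR lines simply fail field extraction rather than being dropped, which is why they still land in the database:

```python
import re

# Simplified stand-ins for the grok patterns (assumption: the real
# TIMESTAMP_ISO8601 definition is broader than this sketch).
TIMESTAMP = r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"
LOGLEVEL = r"(?:[Ee]rr?(?:or)?|ERR?(?:OR)?)"  # the custom LOGLEVEL above

LINE_RE = re.compile(
    rf"(?P<logEventTimestamp>{TIMESTAMP})\s+"
    rf"\[(?P<logLevelId>\d+)\]\s+"
    rf"(?P<logLevel>{LOGLEVEL})\s+-\s+"
    rf"(?P<msg>.*)"
)

def parse(line):
    """Return extracted fields, or None when the grok-style match fails."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

# An ERROR line is parsed into fields; an INFO line fails extraction,
# but the agent still ships the raw line regardless.
parse("2024-02-23 14:36:30 [3] ERROR - disk full")
parse("2024-02-23 14:36:31 [2] INFO - started")
```

The key point: grok only extracts fields from lines it matches; it is not a filter, so non-matching lines are still ingested unparsed.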

Required data:

ZoltanGutleber_0-1708695390500.png

Unnecessary data: 

ZoltanGutleber_1-1708695588668.png

I would like to know how to get rid of them: is there a WHERE clause or a filter I can use?

 


Zoltan.Gutleber
Builder

This answer came to me from support:
 

  1. You cannot select only specific lines to be published to Log Analytics; the analytics agent reads the entire file configured under the log analytics source rule, with all its contents, and sends those to ES as log events.
  2. So how much disk space log analytics data consumes depends entirely on how many log files you are monitoring and how big they are; this cannot be controlled manually.
  3. Once the data is in ES, you may use regex and grok patterns for field extraction; however, every raw message line the analytics agent reads is published to the Events Service, as mentioned in point 1.
  4. Further, to see only certain data, you can make use of ADQL filters and print only the lines you are interested in.
  5. As all the data that was present in your log files is now in ES, it is stored in various shards based on timestamp and spacing.
  6. How much data ES keeps depends on your retention period. So if your analytics retention is (say) 90 days (per your license units and the retention configuration you have set in the Controller), all the data will be stored for at least 90 days, and when those indices expire they will be automatically deleted from the backend.
  7. However, if that is too much data because your ES is currently a single node and does not have enough resources to store the huge amount of data you are sending, you may choose to delete older data and keep it for a shorter duration, such as 30 days, 10 days, or 8 days, which is the minimum retention period.
  8. This does not mean you either delete all log analytics data from the last 90 days or keep all of it; rather, you can choose a shorter retention for data stored in ES and delete old data so it doesn't occupy space unnecessarily.
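Point 4 above can be sketched as an ADQL query in the Analytics Search UI. This is only a sketch: the exact field names (`logEventTimestamp`, `logLevel`, `msg`) come from the grok extraction earlier in the thread and are assumptions about your schema, so adjust them to match your actual extracted fields:

```sql
SELECT logEventTimestamp, logLevel, msg
FROM logs
WHERE logLevel = 'ERROR'
```

Note that this only filters what is displayed; as points 1 and 3 explain, all raw lines are still stored in ES.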

Now, regarding "Having so many resources allocated just for extracting errors from logs does not seem like the right way to me":
 
None of the suggested recommendations was to fetch only ERROR data from the logs, as it is clearly stated that this cannot be done, per the product design. The recommendations were instead about how, in this scenario where we cannot control what comes to ES from your log files, we can still manage your data and space so that you keep the useful data and discard the rest, without having to worry about using more disk space on this host.
 
Regarding "Alternatively, could you recommend how to select only errors from the log files?": this is already answered in point 1.
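Outside the product itself (this is a generic workaround on my part, not an AppDynamics feature, which support confirmed does not exist in the agent), one common approach is to pre-filter the log before the agent sees it: write only matching lines to a side file and point the source rule at that file instead. A minimal sketch:

```python
import re

# Loose level check; note it can also match words like "Error" that
# appear inside a message body, so this is approximate by design.
LOGLEVEL = re.compile(r"\s(?:[Ee]rr?(?:or)?|ERR?(?:OR)?)\s")

def filter_errors(src_path, dst_path):
    """Copy only ERROR-level lines from src_path to dst_path.

    Returns the number of lines kept. The analytics source rule would
    then watch dst_path instead of the full log (an assumption about
    your setup, not a product recommendation).
    """
    kept = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if LOGLEVEL.search(line):
                dst.write(line)
                kept += 1
    return kept
```

Run periodically (or as a `tail -F`-style pipe), this keeps the file the agent ingests small, which addresses the capacity concern without changing agent behavior.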

Ryan.Paredez
Community Manager

Hi @Zoltan.Gutleber,

Thanks so much for coming back and sharing the solution!


Thanks,

Ryan, Cisco AppDynamics Community Manager



