Task Scenario Description
The latest information needs to be extracted from a log file while the application is running.
The limitations are as follows:
- The target platform is a subset of QNX.
- The log file is larger than 100 MB.
- The shell (ksh) command set is incomplete; for example, split and tac are missing.
Solution
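#!/bin/ksh
# Size of the log in 1 KB blocks, as reported by du
log_size=$(du -k logfile_name | awk '{print $1}')
echo "The size of logfile_name is ${log_size}k"
# Skip everything except the last 64 KB
skip_size=$(($log_size - 64))
# Copy the last 64 blocks of 1 KB each into a temporary file
dd if=logfile_name bs=1k count=64 skip=$skip_size of=log_tmp
# Print the last line in that tail that contains the keyword
grep 'keyword' log_tmp | awk 'NF{a=$0}END{print a}'
rm log_tmp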
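The last pipeline stage is worth a closer look. The sketch below runs it on made-up sample text (the variable names and the sample lines are illustrative only) and compares it with a single awk call that does the matching itself. Because grep has already dropped every line without the keyword, the NF test is always true at that point, so for a plain-word pattern the two forms should give the same result.

```shell
# Made-up sample standing in for log_tmp; "keyword" is the placeholder pattern from the script.
sample='keyword first
unrelated line
keyword second
unrelated tail'

# Two-stage form from the script: grep filters, awk keeps the last line it sees.
two_stage=$(printf '%s\n' "$sample" | grep 'keyword' | awk 'NF{a=$0}END{print a}')

# Single-stage form: awk matches the pattern itself and remembers the latest hit.
one_stage=$(printf '%s\n' "$sample" | awk '/keyword/{a=$0}END{print a}')

echo "$two_stage"
echo "$one_stage"
```

The single-stage form saves one process, which may matter on a constrained target; note that awk uses extended regular expressions while grep uses basic ones, so patterns with special characters can behave differently.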
To reduce the response time, I do not want to search the whole file.
The most efficient way seems to be searching from the end of the file.
However, the environment has no reverse-output command such as tac.
My next idea was to use a temporary file, but the environment frustrated me again: there is no split either.
Fortunately, I found dd, an old command that is less commonly used than df and du. Its usage is summarized here.
mnemonics:
``dd`` : disk divide
``du`` : disk used
``df`` : disk free
PS: The name dd may be an allusion to the DD statement found in IBM's Job Control Language (JCL), where the abbreviation stands for "Data Definition".
Syntax
dd [Options]
Key
if=FILE
Input file : Read from FILE instead of standard input.
of=FILE
Output file : Write to FILE instead of standard output. Unless `conv=notrunc'
is given, `dd' truncates FILE to zero bytes (or the size specified
with `seek=').
ibs=BYTES
Read BYTES bytes at a time.
obs=BYTES
Write BYTES bytes at a time.
bs=BYTES
Block size, both read and write BYTES bytes at a time. This overrides `ibs'
and `obs'.
skip=BLOCKS
Skip BLOCKS `ibs'-byte blocks in the input file before copying.
count=BLOCKS
Copy BLOCKS `ibs'-byte blocks from the input file, instead of
everything until the end of the file.
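How bs, skip and count interact is easiest to see on a tiny file. The following is a minimal sketch; the scratch file names dd_demo.in and dd_demo.out are arbitrary and not part of the original script.

```shell
# A 16-byte input made of four 4-byte groups.
printf 'AAAABBBBCCCCDDDD' > dd_demo.in

# bs=4 makes each block 4 bytes; skip=2 jumps over "AAAABBBB",
# and count=1 then copies exactly one block.
dd if=dd_demo.in of=dd_demo.out bs=4 skip=2 count=1 2>/dev/null

third_block=$(cat dd_demo.out)
echo "$third_block"    # CCCC

rm -f dd_demo.in dd_demo.out
```

Both skip and count are measured in blocks, not bytes, so changing bs changes how far skip jumps.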
The numeric-valued options (BYTES and BLOCKS) can be followed by a multiplier: b=512, c=1, w=2, xM=M, or any of the standard block size suffixes like `k'=1024.
The notation xM=M can be confusing: M here does not mean megabyte. It is a multiplier, so NxM means N multiplied by M. For a block size of 1 megabyte, you can write bs=1024x1024.
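A quick way to check what a given block size expression evaluates to is to copy a single block and count the bytes. This sketch assumes /dev/zero and wc are available, which may not hold on a stripped-down target; the variable names are illustrative.

```shell
# Copy one block of each size from /dev/zero and count the bytes produced.
# The $(( ... )) wrapper strips any padding that wc may add around the number.
one_k=$(( $(dd if=/dev/zero bs=1k count=1 2>/dev/null | wc -c) ))
one_m=$(( $(dd if=/dev/zero bs=1024x1024 count=1 2>/dev/null | wc -c) ))

echo "bs=1k copies $one_k bytes"
echo "bs=1024x1024 copies $one_m bytes"
```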
For me, skip is the key option: it lets me copy only the last part of the log file. With split, I would have to wait until the whole file had been divided into several pieces.
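The skip-based approach can be wrapped in a small helper. The sketch below is one possible variant, not the original script: the function name tail_kb and its arguments are hypothetical, it takes the file length from field 5 of ls -l instead of du -k (du reports allocated blocks, which can differ from the byte length), and it omits count so that dd copies up to the current end of the file. It also guards against files shorter than the requested tail, where skip would otherwise be negative.

```shell
# tail_kb FILE KB: write roughly the last KB kilobytes of FILE to stdout.
# Hypothetical helper; name and arguments are assumptions for illustration.
tail_kb() {
    file=$1
    want_kb=$2
    # Field 5 of "ls -l" is the file length in bytes.
    bytes=$(ls -l "$file" | awk '{print $5}')
    # Round up so that a trailing partial block is still counted.
    size_kb=$(( (bytes + 1023) / 1024 ))
    skip=$((size_kb - want_kb))
    # A file shorter than the requested tail would give a negative skip.
    if [ "$skip" -lt 0 ]; then
        skip=0
    fi
    # No count=: copy from the skip point to the end of the file.
    dd if="$file" bs=1k skip="$skip" 2>/dev/null
}
```

A call such as tail_kb logfile_name 64 | awk '/keyword/{a=$0}END{print a}' would then avoid the temporary file altogether. The first line of the output is usually a partial line, since the skip point rarely falls on a line boundary.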