Basestream.seek

Hi,

I need to read lines backwards in a TXT file with a fixed length for every line.

I'm using

Dim linetxt As String

Dim counter As Integer = -56 ' Negative offset equal to the length of one line (56 bytes)

Dim data_r As New StreamReader(data_path) 'data_path contains full path and name for the file

data_r.BaseStream.Seek(counter, SeekOrigin.End)

linetxt = data_r.ReadLine

This gives the correct result on the first read. However, the last two lines sit inside a loop that decreases counter on every pass to get the previous line.

After the first read, subsequent reads are wrong because Seek moves the pointer in the BaseStream but not in the StreamReader itself.

Sometimes the buffer seems to get emptied, and then the values are correct.

Is there some way to synchronize the pointer in the BaseStream and the StreamReader?

Regards,

Jose Adell




  • Christofer

    Good to see that you found a workaround!

    I checked StreamReader and yes, it really does buffer data. (In my personal opinion it should track whether the underlying stream has changed position and update its buffer, so we would not have such problems.) It also has a System.IO.StreamReader.DiscardBufferedData() method that gets rid of the buffered data and forces the reader to read from the base stream again.
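
As an illustration of the DiscardBufferedData() approach described above, here is a minimal sketch. It assumes ASCII text and 56-byte records (54 characters plus CrLf) as in the question; the sample file, the module name and the RecordLength constant are made up for the demonstration.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module DiscardBufferDemo
    ' Assumption taken from the thread: 54 characters plus CrLf per line
    Const RecordLength As Integer = 56

    Sub Main()
        ' Hypothetical sample file: five fixed-length records whose first
        ' character is the record number (1 to 5)
        Dim dataPath As String = Path.GetTempFileName()
        Using writer As New StreamWriter(dataPath, False, Encoding.ASCII)
            For i As Integer = 1 To 5
                writer.Write(i.ToString().PadRight(RecordLength - 2, "x"c) & vbCrLf)
            Next
        End Using

        Using reader As New StreamReader(dataPath, Encoding.ASCII)
            For i As Integer = 1 To 5
                ' Move the underlying stream to the i-th record from the end...
                reader.BaseStream.Seek(-RecordLength * i, SeekOrigin.End)
                ' ...then drop what the reader has cached, so the next
                ' ReadLine starts at the new stream position
                reader.DiscardBufferedData()
                Console.WriteLine(reader.ReadLine().Substring(0, 1))
            Next
        End Using

        File.Delete(dataPath)
    End Sub
End Module
```

With the sample data above, the loop should print the record numbers in reverse order (5 down to 1). Without the DiscardBufferedData() call, the reader would keep serving lines from its own buffer, which is the behaviour reported in the question.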



  • DavidBarracuda

    Hi Sergey

    First of all, thanks for your interest. About your questions:

    Yes, every line is a 56-byte string: 54 chars plus CrLf, 56 in total. As I told you, every first reading is correct no matter where I send the pointer, as long as it is (-56)*i.

    The program reads something like a worker's log; the first digit in each line is the last mode used by the operator.

    Those modes can last for hours, so the program doesn't need to keep running while doing nothing. When it is launched again to change to a new mode, it should read the log backwards to find out when the current mode started, compute how long it lasted against the current time, etc., and then do a second reading to find out when the working day started (the line starting with "0").

    I need it to be as quick as possible, as it runs on a PPC and not on a PC, which means much less processor speed. That's why I read backwards: the last day is recorded at the end of the file.

    I think the problem is about this paragraph in the help system: "StreamReader might buffer input such that the position of the underlying stream will not match the StreamReader position."

    I became aware of this problem when the program is freshly installed and the file has just a few lines. I compared the dates recorded in the first line and the last one to find out whether only one day had been recorded:

    Dim data_r As New StreamReader(data_path)

    Dim lineinit As String = data_r.ReadLine

    data_r.BaseStream.Seek(-56, SeekOrigin.End)

    Dim linetoday As String = data_r.ReadLine

    You would expect to get the first and the last line of the file, right? Instead, what you always get is the first and second lines of the file. So the Seek operation is not passed on to the StreamReader, maybe just to a buffered image of it.

    But if I do:

    Dim data_r As New StreamReader(data_path)

    Dim lineinit As String = data_r.ReadLine

    data_r.Close()

    data_r = New StreamReader(data_path)

    data_r.BaseStream.Seek(-56, SeekOrigin.End)

    Dim linea As String = data_r.ReadLine

    Then I get the results I expected. Further checks, changing the Seek point, confirmed that the reading is always correct if Seek is done before the first read and never when it is done after the first read, but curiously only when going backwards. This is the workaround I found, but it seems to me not very elegant and it costs processor time.

    I tried with Position and the results were exactly the same.

    Possibly there is a synchronization option (between the StreamReader and its underlying BaseStream), but that is beyond my knowledge at this stage.

    Jose Adell


  • wkbia

    Hi Sergey

    I said in my post that the last two lines are inside a loop which decreases the value of counter, exactly as in your proposal.

    I also tried the Position property instead of Seek, and it produces identical wrong results.

    The main point is that Stream has no ReadLine, which is what I need, so I'm using the StreamReader.BaseStream property, which clearly does not move the pointers in the original StreamReader.

    Am I missing something?

    Jose Adell


  • Cao Yang

    SeekOrigin.End always seeks from the end of the stream. So you are always saying "go to position -56 from the end of the stream".

    You can do this:

    Stream.Position = Stream.Length - 56*i

    (i is the line number, counted from the end)



  • Kishore

    I don't think it will degrade performance, because it discards only the StreamReader's buffer, not the base stream's or the file system's buffers. StreamReader does not keep much data buffered, so I think the discard will be quick enough.

  • Ahmedweb

    Thanks for your style notes.

    After each read, SeekOrigin.Current points +56 bytes past the position of the last Seek, so when using it the offset should be -112 to get the previous line.

    Jose Adell
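
The -112 offset can be sketched as follows. This is only an illustration, assuming a FileStream, ASCII text and 56-byte records; PrintModesBackwards and RecordLength are hypothetical names, and the file path is taken from the command line.

```vb
Imports System
Imports System.IO
Imports System.Text

Module SeekCurrentDemo
    ' Assumption taken from the thread: 54 characters plus CrLf per line
    Const RecordLength As Integer = 56

    ' Hypothetical helper: prints the first character (the mode) of every
    ' record, from the last record to the first
    Sub PrintModesBackwards(ByVal dataPath As String)
        Dim buffer(RecordLength - 1) As Byte
        Using fs As New FileStream(dataPath, FileMode.Open, FileAccess.Read)
            If fs.Length < RecordLength Then Return

            ' Start at the beginning of the last record
            fs.Seek(-RecordLength, SeekOrigin.End)
            Do
                ' Reading a record advances the position by 56 bytes
                ' (a short read is not handled in this sketch)
                fs.Read(buffer, 0, RecordLength)
                Console.WriteLine(Encoding.ASCII.GetString(buffer, 0, 1))

                ' If fewer than two records lie behind the current position,
                ' the record just read was the first one in the file
                If fs.Position < 2 * RecordLength Then Exit Do

                ' Go back over the record just read plus the previous one
                fs.Seek(-2 * RecordLength, SeekOrigin.Current)
            Loop
        End Using
    End Sub

    Sub Main(ByVal args As String())
        If args.Length > 0 Then PrintModesBackwards(args(0))
    End Sub
End Module
```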


  • Chandrasekar Jayaraj

    The problem is that you use Seek(counter, SeekOrigin.End) and do not change counter at all. You must change it: it must be -56, -56*2, -56*3, -56*4...

    StreamReader does not have internal pointers; it holds a reference to the Stream and always reads from the stream's current position. Every read operation moves the pointer forward.

    If your file has 56-byte records, then you may not need ReadLine() at all; you can simply read into a byte[56] array. In that case you can use the Stream directly, without a reader.

    Add data_r.BaseStream.Position = data_r.BaseStream.Length - 56 * i, where i is set to 1, 2, 3, 4...
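
A minimal sketch of reading the fixed-length records straight from the stream, as suggested above. It assumes ASCII text and 56-byte records; ReadRecordFromEnd and RecordLength are hypothetical names, and the file path is taken from the command line.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module FixedRecordReader
    ' Assumption taken from the thread: 54 characters plus CrLf per line
    Const RecordLength As Integer = 56

    ' Hypothetical helper: returns the i-th record counted from the end
    ' (i = 1 is the last line), with the trailing CrLf removed
    Function ReadRecordFromEnd(ByVal stream As Stream, ByVal i As Long) As String
        Dim buffer(RecordLength - 1) As Byte
        stream.Position = stream.Length - RecordLength * i

        ' Stream.Read may return fewer bytes than requested, so keep reading
        ' until the record is complete or the stream ends
        Dim total As Integer = 0
        Do While total < RecordLength
            Dim n As Integer = stream.Read(buffer, total, RecordLength - total)
            If n = 0 Then Exit Do
            total += n
        Loop

        Return Encoding.ASCII.GetString(buffer, 0, total).TrimEnd(ControlChars.Cr, ControlChars.Lf)
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then Return
        Using fs As New FileStream(args(0), FileMode.Open, FileAccess.Read)
            Dim count As Long = fs.Length \ RecordLength
            ' Print every complete record, last one first
            For i As Long = 1 To count
                Console.WriteLine(ReadRecordFromEnd(fs, i))
            Next
        End Using
    End Sub
End Module
```

Because no StreamReader is involved, there is no reader-side buffer that can fall out of step with the stream position.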



  • RiteshPatel

    I just thought of something: are you sure you have 56-byte strings? Every string ends with one or two invisible characters (carriage return and line feed, depending on the system), two on Windows. So maybe you need not 56 but 58. The last line is read correctly because it is at the end of the stream, but all the others won't work. Also, if you are reading a Unicode-encoded file, you must use 58*2.

    BTW, are you sure you need such inverse logic? If the file is smaller than 1 MB, you can efficiently read it line by line into a list and reverse it. That would be more natural.
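
For a small file, the read-and-reverse idea could look roughly like this. It is a sketch only; ReadLinesReversed is a hypothetical helper and the file path is taken from the command line.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module ReverseLines
    ' Hypothetical helper: reads the whole file forward, then reverses the
    ' list so that the last line of the file comes first
    Function ReadLinesReversed(ByVal filePath As String) As List(Of String)
        Dim lines As New List(Of String)
        Using reader As New StreamReader(filePath)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                lines.Add(line)
                line = reader.ReadLine()
            End While
        End Using
        lines.Reverse()
        Return lines
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then Return
        For Each line As String In ReadLinesReversed(args(0))
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

This reads the stream only forward, so the reader's buffering never gets in the way, at the cost of holding the whole file in memory.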



  • Kammy

    Hi Sergey,

    Yes, I did read about System.IO.StreamReader.DiscardBufferedData(), but as the buffered data is the whole stream, possibly you get nothing after it. I'm not sure, as I didn't experiment with it. Anyway, if the stream is a big file, and all this is running on a PDA, successive discards and new reads could slow down the process.

    The really curious thing about this issue is that it performs very well going forward but is buggy going backward.

    Anyway, it has been good to learn a little more about VS 2005.

    Thanks for your help.

    Jose Adell


  • Herve ANCHER

    Hi,

    Well, all is sorted now by using FileStream, reading the line into an array and then decoding it.

    Basically:

    Dim data_f As New FileStream(data_path, FileMode.Open, FileAccess.Read)

    Dim buffer(53) As Byte

    Dim encoder As New System.Text.ASCIIEncoding()

    data_f.Read(buffer, 0, 54)

    Dim lineand As String = encoder.GetString(buffer, 0, 54)

    data_f.Seek(-56, SeekOrigin.End)

    data_f.Read(buffer, 0, 54)

    Dim linea As String = encoder.GetString(buffer, 0, 54)

    Dim contador As Long = -56

    Do While Microsoft.VisualBasic.Left(lineand, 1) <> "0"

    contador = contador - 56

    data_f.Seek(contador, SeekOrigin.End)

    data_f.Read(buffer, 0, 54)

    lineand = encoder.GetString(buffer, 0, 54)

    Loop

    This code really moves the pointer, and the reads happen where they are supposed to.

    Regards,

    Jose Adell


  • NicBar

    More of a stylistic question. Mightn't a For loop from FileStream.Length -1 To 0 Step -56 be clearer? If not, then maybe it would be more performant to simply seek -56 from SeekOrigin.Current rather than seeking from the End every time?

    You should use assignment operators like -= (counter -= 56) more.

    Like I said, nothing serious, just stylistic questions about why people code different ways...



  • Fatih Durgut

    Hi Sergey

    I repeat "The last two lines are inside a loop that decreases the value of counter", so I just extracted from my coding the interesting part. The full code is:

    Dim contador As Long = -56

    Do While Microsoft.VisualBasic.Left(lineand, 1) <> "0"

    contador = contador - 56

    'data_r.BaseStream.Seek(contador, SeekOrigin.End)

    lineand = data_r.ReadLine

    Loop

    This produces a correct first reading, but not a second, third, etc.

    I have other similar forward loops working correctly; the problem arises only when the loop goes backward.

    I had a workaround for the problem, closing the StreamReader instance and creating a new one on every cycle, but I didn't like it as it is not very elegant. I will play a little with your proposal of using Stream instead of StreamReader and will let you know.

    Thanks,

    Jose Adell


  • handshake it

    Hi, Antony!

    Seeking from the end and seeking from the current position are the same in performance: both just set the pointer to the required position, and the computations are not so different, "position = position - X" vs. "position = length - X".

    About the -= operator: I'm a C# developer, and I use it every time I have a chance, to make my intentions clearer and leave less room for errors. It's really good to use.


