Read the position of the character efficiently
I have the following text file. Each data field is separated by | and each line ends with a newline character.
|1|data1|data2|....|....|....|\n
|2|data2|data3|....|....|....|\n
.
.
I want to collect the data fields between the 2nd and 3rd | symbols. My plan is to find the position of the 2nd | symbol, read data until the 3rd |, and then find the newline symbol to repeat the same. I heard that we can move the file cursor using the lseek function if we have the position. I can read character by character until I find the 2nd and 3rd | symbols, but then I would like a faster way to find the newline symbol. What is the most efficient way to do this? The following is my source code:
std::string str("1|data1|data2|....|....|....|\n");
std::string str2("|");
std::size_t firstpipe = str.find(str2);
std::size_t secondpipe = str.find(str2, firstpipe + 1);
if (secondpipe != std::string::npos) {
    std::cout << "first '|' found at: " << firstpipe << '\n';
    std::cout << "second '|' found at: " << secondpipe << '\n';
}
c++
2
Much easier to read each record into a std::string using std::getline, and then extract the values from the string. Probably more efficient, too.
– Neil Butterworth
Nov 11 at 17:58
1
Files are streaming devices. They are most efficient when transferring large amounts of data. So your most efficient method would be to read text lines (records), then search memory to get the field you want. You may find that reading many lines (or blocks of data) is more efficient than reading a line at a time. Also, searching in memory is faster than searching on the drive.
– Thomas Matthews
Nov 11 at 18:10
@ThomasMatthews I see your point. I added a piece of code and am referring to it. There I used a string. According to you, I should read that string as a block from the file. Is that right? And does searching memory mean I should use memmap?
– Malintha
Nov 11 at 18:18
@Malintha There is now a discrepancy between your data file example and the str variable in the code above. It will have a very off-by-one kind of feeling when you try to debug.
– Bo R
Nov 11 at 18:25
Your code doesn't read from a file.
– Thomas Matthews
Nov 11 at 18:28
edited Nov 11 at 18:16
asked Nov 11 at 17:52
Malintha
1 Answer
1
In pseudo code:
while (read line with `std::getline` into `std::string`)
    find first separator with `std::string::find`
    if not found skip line
    find second separator with `std::string::find` starting from first separator position + 1
    if not found skip line
    find third separator with `std::string::find` starting from second separator position + 1
    if not found skip line
    use `std::string::substr(secondPos+1, thirdPos-secondPos-1)` to get your data block.
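The steps above might translate into C++ roughly like this. It is a minimal sketch, not a tuned implementation; the function names `extract_field` and `collect_fields` are invented for this example, and it skips any line with fewer than three separators.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: copies the text between the 2nd and 3rd separator of
// `line` into `out`. Returns false if the line has fewer than three separators.
bool extract_field(const std::string& line, std::string& out, char sep = '|') {
    const std::size_t first = line.find(sep);
    if (first == std::string::npos) return false;
    const std::size_t second = line.find(sep, first + 1);
    if (second == std::string::npos) return false;
    const std::size_t third = line.find(sep, second + 1);
    if (third == std::string::npos) return false;
    out = line.substr(second + 1, third - second - 1);
    return true;
}

// Hypothetical helper: reads records line by line from any input stream
// and collects the extracted field of every well-formed line.
std::vector<std::string> collect_fields(std::istream& in) {
    std::vector<std::string> fields;
    std::string line;
    std::string field;
    while (std::getline(in, line)) {
        if (extract_field(line, field)) fields.push_back(field);
    }
    return fields;
}
```

To try it on a real file, open a `std::ifstream` and pass it to `collect_fields`; a `std::istringstream` works the same way for a quick test. Since `std::getline` already stops at the newline, there is no need to look for the end of the line separately or to use `lseek`.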
answered Nov 11 at 18:22
Bo R