Concatenating CSV files in bash preserving the header only once

Imagine I have a directory containing many subdirectories each containing some number of CSV files with the same structure (same number of columns and all containing the same header).

I am aware that I can run from the parent folder something like

find ./ -name '*.csv' -exec cat {} \; > ~/Desktop/result.csv

And this will work fine, except for the fact that the header is repeated each time (once for each file).

I'm also aware that I can do something like sed 1d <filename> or tail -n +<N+1> <filename> to skip the first line of a file.

But in my case, it seems a bit more specialised. I want to preserve the header once for the first file and then skip the header for every file after that.

Is anyone aware of a way to achieve this using standard Unix tools (like find, head, tail, sed, awk etc.) and bash?

For example, given these input files:

 /folder1
   /file1.csv
   /file2.csv
 /folder2
   /file1.csv

where each file has the header A,B,C and one data row 1,2,3.

The desired output would be:

A,B,C
1,2,3
1,2,3
1,2,3

Marked As Duplicate

I feel this is different from the other questions it was linked to, specifically because those solutions reference file1 and file2 explicitly. My question asks about a directory structure with an arbitrary number of files, where I would not want to type out each file one by one.

edited Nov 15 '18 at 9:27

asked Nov 14 '18 at 19:13

David

"once for the first file": which file is first? Or does it make no difference which file the header is taken from?

– Kamil Cuk
Nov 14 '18 at 19:21

Makes no difference in this case :) all files contain the same header and I don't mind which comes first.

– David
Nov 15 '18 at 9:22

None of the linked questions is an exact duplicate of this problem, hence reopening.

– anubhava
Nov 15 '18 at 10:38

bash awk sed cat unix-head

2 Answers

You may use this find + xargs + awk:

find . -name '*.csv' -print0 | xargs -0 awk 'NR==1 || FNR>1'

The condition NR==1 || FNR>1 is true for the very first line of the combined input (the header of the first file) and for every line that is not the first line of its own file.
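To see the difference between NR and FNR in action, here is a small sketch that rebuilds the example layout from the question in a temporary directory and runs the same pipeline on it. The variable names demo_dir and merged are only for illustration.

```shell
#!/usr/bin/env bash
# Recreate the question's example layout in a throwaway directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/folder1" "$demo_dir/folder2"
for f in folder1/file1.csv folder1/file2.csv folder2/file1.csv; do
  printf 'A,B,C\n1,2,3\n' > "$demo_dir/$f"
done

# NR counts lines across all input files; FNR restarts at 1 for each file.
# NR==1 keeps the first header only; FNR>1 keeps every non-header line.
merged=$(cd "$demo_dir" && find . -name '*.csv' -print0 | xargs -0 awk 'NR==1 || FNR>1')
printf '%s\n' "$merged"

rm -rf "$demo_dir"
```

With three identical files this prints the header A,B,C once, followed by three 1,2,3 rows, whatever order find happens to return the files in.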

answered Nov 14 '18 at 19:22

anubhava

This assumes that xargs can process all the files using a single call to awk.

– chepner
Nov 14 '18 at 20:12

Yes that's right.

– anubhava
Nov 14 '18 at 20:17
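If there are so many files that xargs has to start awk more than once, NR restarts in each awk process and a header would be printed per batch. One possible way around that, sketched here under a few assumptions, is to print the header separately and then drop line 1 of every file. The function name concat_csv is hypothetical, and picking the header file with head -n 1 assumes file names contain no newlines.

```shell
#!/usr/bin/env bash
# concat_csv DIR: hypothetical helper, not part of any answer above.
concat_csv() {
  local dir=$1 first
  # Any CSV can supply the header, since all files share the same one.
  # Assumption: file names do not contain newlines.
  first=$(find "$dir" -name '*.csv' -type f | head -n 1)
  [ -n "$first" ] || return 1
  head -n 1 "$first"
  # FNR>1 drops the first line of every file, so the result is the same
  # no matter how many awk processes find starts for "{} +".
  find "$dir" -name '*.csv' -type f -exec awk 'FNR>1' {} +
}
```

It could be called as concat_csv . > ~/Desktop/result.csv. Writing the result outside the directory being searched matters here; otherwise find may pick up the output file itself if it ends in .csv.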

$ {
> cat real-daily-wages-in-pounds-engla.tsv;
> tail -n+2 real-daily-wages-in-pounds-engla.tsv;
> } | cat

You can pipe the output of multiple commands through cat. tail -n+2 selects all lines from a file, except the first.
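The same cat / tail idea can be extended to an arbitrary number of files with a bash loop over the output of find. This is only a sketch: the function name concat_csv_loop and the first_file flag are made up for illustration, and it relies on bash features (read -d '' and process substitution) plus find's -print0, which the other answer also uses.

```shell
#!/usr/bin/env bash
# concat_csv_loop DIR: hypothetical helper that applies the cat / tail
# idea to every CSV file found under DIR.
concat_csv_loop() {
  local dir=$1 first_file=1 f
  # NUL-delimited names keep paths with spaces or newlines intact.
  while IFS= read -r -d '' f; do
    if [ "$first_file" -eq 1 ]; then
      cat "$f"            # first file: keep the header
      first_file=0
    else
      tail -n +2 "$f"     # later files: skip line 1 (the header)
    fi
  done < <(find "$dir" -name '*.csv' -type f -print0)
}
```

This starts one cat or tail process per file, so it is likely slower than a single awk call on large trees, but it does not depend on all file names fitting into one command line.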

answered Nov 14 '18 at 19:21

Mark
