How can I filter between two matrices?

7 votes

File1:

91 23 56 44 87 77
99 34 56 22 22 95
41 88 26 79 60 27
95 55 66 69 92 25

File2:

pass fail pass pass pass fail
pass fail pass fail fail pass
pass pass fail pass pass fail
pass pass fail pass pass fail

I want to sum the fail marks in each row. Here is the expected output.

output:

100
78
53
91

How can I filter file1 based on the word "fail" in file2 to get the sum of the fail marks?

edited yesterday by Braiam

asked yesterday by Owen

What is producing these two files and can't that program do this?
– Kusalananda
yesterday

text-processing

7 Answers

4 votes, accepted

I don't think you need an END section:

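awk '
NR == FNR {                      # first file: store each mark by (column, row)
    for (i=1; i<=NF; i++) F[i,NR] = $i
    next
}
{                                # second file: add up the marks flagged as fail
    T = 0
    for (i=1; i<=NF; i++) T += ($i=="fail")?F[i,FNR]:0
    print T
}
' file[12]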
100
78
53
91

answered yesterday by RudiC (edited yesterday)

You are right, the END section is redundant, +1.
– jimmij
yesterday

10 votes

I would use a matrix language for such a task, e.g. GNU Octave.

Assuming the marks are stored in a file called marks and the pass/fail matrix in a file called passfail, first convert the latter into numerical values, e.g.:

sed 's/pass/1/g; s/fail/0/g' passfail > passfail.nums

You can now do the following:

marks = dlmread('marks');
passfail = dlmread('passfail.nums');

for i = 1:size(marks)(1)
 sum(marks(i,:)(passfail(i,:) == 0))
end

Output:

ans = 100
ans = 78
ans = 53
ans = 91
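
For readers without Octave, the same masking idea can be sketched in plain Python. This is only an illustration: the function name masked_row_sums and the inline sample data are assumptions made here, not part of the answer above. The 0/1 encoding follows the sed conversion shown earlier (0 means fail).

```python
def masked_row_sums(marks, passfail):
    """Sum, per row, the marks whose mask entry is 0 (0 = fail, 1 = pass)."""
    sums = []
    for mark_row, mask_row in zip(marks, passfail):
        # Keep only the marks at positions where the mask is 0.
        sums.append(sum(m for m, flag in zip(mark_row, mask_row) if flag == 0))
    return sums


# Sample data copied from the question; the mask mirrors the sed output.
marks = [
    [91, 23, 56, 44, 87, 77],
    [99, 34, 56, 22, 22, 95],
    [41, 88, 26, 79, 60, 27],
    [95, 55, 66, 69, 92, 25],
]
passfail = [
    [1, 0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
]

print(masked_row_sums(marks, passfail))
```

With the question's data this should give the same four sums as the Octave loop.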

answered yesterday by Thor

6 votes

Here is my awk approach:

awk 'NR==FNR{for(i=1;i<=NF;i++) a[NR"-"i]=$i; next}
     {for(j=1;j<=NF;j++) if($j=="fail") b[FNR]+=a[FNR"-"j]}
     END{for(k in b) print b[k]}' file1 file2

Standard awk has no true two-dimensional arrays, so we emulate one by combining the two numbers (row and field) in a single array index. The output is the four row sums 100, 78, 53 and 91; note that for (k in b) does not guarantee any particular order.
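
For readers less familiar with awk, the combined-index trick can be sketched in Python with an ordinary dictionary. This is a rough illustration under assumptions: the function name fail_sums_composite_key is hypothetical, and the input is passed as lists of lines rather than read from file1 and file2.

```python
def fail_sums_composite_key(mark_lines, decision_lines):
    """Emulate the awk approach: one flat table indexed by "row-column" strings."""
    table = {}
    # First pass: store every mark under a combined "row-column" key.
    for row, line in enumerate(mark_lines, start=1):
        for col, mark in enumerate(line.split(), start=1):
            table[f"{row}-{col}"] = int(mark)

    sums = []
    # Second pass: look up the mark wherever the decision is "fail".
    for row, line in enumerate(decision_lines, start=1):
        total = 0
        for col, decision in enumerate(line.split(), start=1):
            if decision == "fail":
                total += table[f"{row}-{col}"]
        sums.append(total)
    return sums


# First two rows of the question's data.
marks = ["91 23 56 44 87 77", "99 34 56 22 22 95"]
decisions = ["pass fail pass pass pass fail", "pass fail pass fail fail pass"]
print(fail_sums_composite_key(marks, decisions))
```

The "-" separator matters: without it, row 1 column 11 and row 11 column 1 would both map to the key "111".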

answered yesterday by jimmij (edited yesterday)

5 votes

While I think awk is good for portability, other languages seem easier to write and read for this task. GNU Octave was mentioned, but it does not come pre-installed on most machines. On the other hand, most systems have some version of Python preinstalled. Here is a Python version:

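with open('file1') as marks_file, open('file2') as decisions_file:
    # Read both files in step, one row of marks and one row of decisions at a time.
    for marks, decisions in zip(marks_file, decisions_file):
        row_score = 0
        for mark, decision in zip(marks.split(), decisions.split()):
            if decision == 'fail':
                row_score += int(mark)
        print(row_score)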

which returns the outputs you expected.

answered yesterday by Maxim (new contributor)

2 votes

An awk script makes this fairly easy to solve; something like the one below should work. I suspect it is a bit slower than jimmij's answer, which has since been posted.

#!/usr/bin/awk -f

# First file (file2): remember, per row, the column numbers marked "fail".
FNR == NR {
    for(i=1;i<=NF;i++)
        if ( $i == "fail")
            idxArray[FNR] = (idxArray[FNR]) ? (idxArray[FNR]" "i):(i)
    next
}

# Second file (file1): sum the marks found in the remembered columns.
{
    delete Array
    delete Line
    sum=0
    n=split(idxArray[FNR],Array," ")
    l=split($0,Line," ")
    for (i=1;i<=n;i++)
        for (j=1;j<=l;j++)
            if (Array[i] == j )
                sum += Line[j]
    print sum
}

and run the script as

awk -f script.awk file2 file1

answered yesterday by Inian (edited yesterday)

2 votes

awk '
  BEGIN { pf=ARGV[2]; ARGV[2]="" }
  { getline l <pf; split(l, a); n=0;
    for(i=1;i<=NF;i++) if(a[i]=="fail") n+=$i;
    print n }
' file1 file2
100
78
53
91

Just like @Maxim's Python version, but unlike all the other answers, this processes the two files in parallel, line by line, instead of loading one of them whole into memory.
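
The lockstep reading described here can also be sketched as a Python generator. This is only a sketch under assumptions: the function name stream_fail_sums is hypothetical, and io.StringIO objects stand in for the two opened files so the example is self-contained.

```python
import io


def stream_fail_sums(marks_file, decisions_file):
    """Yield one sum per row while reading both inputs in lockstep."""
    # zip() over file-like objects pulls one line from each per iteration,
    # so neither input is read into memory in full.
    for marks, decisions in zip(marks_file, decisions_file):
        yield sum(
            int(mark)
            for mark, decision in zip(marks.split(), decisions.split())
            if decision == "fail"
        )


# io.StringIO stands in for open('file1') and open('file2') here.
file1 = io.StringIO("91 23 56 44 87 77\n99 34 56 22 22 95\n")
file2 = io.StringIO("pass fail pass pass pass fail\npass fail pass fail fail pass\n")
for total in stream_fail_sums(file1, file2):
    print(total)
```

Because each row's sum is produced as soon as its two lines are read, memory use stays constant regardless of how many rows the files contain.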

answered yesterday by mosvy (edited yesterday)

0 votes

One-liner:

paste file[12] | awk '{T=0; for (i=1; i<=NF/2; i++) T += ($(i+NF/2)=="fail")?$i:0; print T}'
100
78
53
91

answered 12 hours ago by RudiC
