How can I do filtering between two matrices?
File1:
91 23 56 44 87 77
99 34 56 22 22 95
41 88 26 79 60 27
95 55 66 69 92 25
File2:
pass fail pass pass pass fail
pass fail pass fail fail pass
pass pass fail pass pass fail
pass pass fail pass pass fail
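I want to sum the failed marks in each row, i.e. the numbers in File1 whose corresponding entry in File2 is "fail". In the first row, for example, those are 23 and 77, giving 100. Here is the expected output.
output: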
100
78
53
91
How can I filter file1 based on the word "fail" in file2 to get the sum of the failed marks for each row?
text-processing
edited yesterday
Braiam
asked yesterday
Owen
What is producing these two files and can't that program do this?
– Kusalananda
yesterday
7 Answers
I don't think you need an END section:
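    awk '
    # First file (marks): store every field, keyed by (column, row).
    NR == FNR { for (i = 1; i <= NF; i++) F[i, NR] = $i
                next
              }
    # Second file (pass/fail): add up the marks stored under each "fail".
              { T = 0
                for (i = 1; i <= NF; i++) T += ($i == "fail") ? F[i, FNR] : 0
                print T
              }
    ' file[12]
    100
    78
    53
    91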
edited yesterday
answered yesterday
RudiC
You are right, END section is redundant, +1.
– jimmij
yesterday
I would use a matrix language for such a task, e.g. GNU Octave.
Assuming the marks are in a file named marks and the pass/fail words are in a file named passfail, first convert the latter into numerical values, e.g.:
    sed 's/pass/1/g; s/fail/0/g' passfail > passfail.nums
You can now do the following:
    marks = dlmread('marks');
    passfail = dlmread('passfail.nums');
    % For each row, sum the marks whose pass/fail value is 0 (fail).
    for i = 1:size(marks)(1)
      sum(marks(i,:)(passfail(i,:) == 0))
    end
Output:
    ans = 100
    ans = 78
    ans = 53
    ans = 91
answered yesterday
Thor
Here is my awk approach:
    awk 'NR==FNR{for(i=1;i<=NF;i++) a[NR"-"i]=$i; next}
    {for(j=1;j<=NF;j++) if($j=="fail") b[FNR]+=a[FNR"-"j]}
    END{for(k in b) print b[k]}' file1 file2
Standard awk has no true two-dimensional arrays, so we emulate one by combining the two numbers (row and field) into a single array index. The output is:
    100
    78
    53
    91
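For comparison, here is a minimal Python sketch of the same composite-index idea, where a dictionary keyed by a (row, field) tuple stands in for the awk array `a`. The function name `fail_sums` and the in-memory sample rows are assumptions made for illustration; they are not part of the answer above. The totals are kept in a list, so rows come out in input order, whereas awk's `for (k in b)` makes no promise about ordering.

```python
def fail_sums(marks_rows, status_rows):
    """Sum, per row, the marks whose matching status is 'fail'."""
    # Emulate a 2-D array with one dict keyed by (row, field),
    # much like the awk index built from NR "-" i.
    cell = {}
    for r, line in enumerate(marks_rows, start=1):
        for c, value in enumerate(line.split(), start=1):
            cell[(r, c)] = int(value)

    totals = []
    for r, line in enumerate(status_rows, start=1):
        total = 0
        for c, status in enumerate(line.split(), start=1):
            if status == "fail":
                total += cell[(r, c)]
        totals.append(total)
    return totals


if __name__ == "__main__":
    # Sample data copied from the question.
    marks = ["91 23 56 44 87 77", "99 34 56 22 22 95",
             "41 88 26 79 60 27", "95 55 66 69 92 25"]
    status = ["pass fail pass pass pass fail", "pass fail pass fail fail pass",
              "pass pass fail pass pass fail", "pass pass fail pass pass fail"]
    for total in fail_sums(marks, status):
        print(total)
```

Unlike the awk version, a row without any "fail" still produces a total (0) here, because every row gets an entry in the list.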
edited yesterday
answered yesterday
jimmij
While I think using awk is good for portability, other languages seem easier to write and read for this task. GNU Octave was mentioned, but it does not come pre-installed on most machines. On the other hand, most systems have a version of Python preinstalled. Here is a Python version:
    for marks, decisions in zip(open('file1').readlines(), open('file2').readlines()):
        row_score = 0
        for mark, decision in zip(marks.split(), decisions.split()):
            if decision == 'fail':
                row_score += int(mark)
        print(row_score)
which prints the output you expected.
answered yesterday
Maxim
An Awk script makes this fairly easy to solve; something like the following should work. It is probably a bit slower than jimmij's answer, which was posted in the meantime.
    #!/usr/bin/awk -f

    # First file (file2): remember the column numbers marked "fail" for each row.
    FNR == NR {
        for (i = 1; i <= NF; i++)
            if ($i == "fail")
                idxArray[FNR] = (idxArray[FNR]) ? (idxArray[FNR] " " i) : (i)
        next
    }

    # Second file (file1): add up the marks found in the remembered columns.
    {
        delete Array
        delete Line
        sum = 0    # start at 0 so that a row without any "fail" prints 0
        n = split(idxArray[FNR], Array, " ")
        l = split($0, Line, " ")
        for (i = 1; i <= n; i++)
            for (j = 1; j <= l; j++)
                if (Array[i] == j)
                    sum += Line[j]
        print sum
    }
and run the script as
    awk -f script.awk file2 file1
edited yesterday
answered yesterday
Inian
    awk '
      # Take file2 off the argument list; it is read by hand with getline.
      BEGIN { pf = ARGV[2]; ARGV[2] = "" }
      {
        # Read the matching line of file2 and split it into the array a.
        getline l < pf; split(l, a); n = 0
        for (i = 1; i <= NF; i++) if (a[i] == "fail") n += $i
        print n
      }
    ' file1 file2
    100
    78
    53
    91
Just like @Maxim's Python version, but unlike all the other answers, this processes the two files in parallel, line by line, instead of loading one of them whole into memory.
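The same streaming idea can be sketched in Python by iterating over the two file objects directly, so that only one line from each file is held at a time. This is an illustrative sketch rather than the code from any answer here: the function name `fail_sums_streaming` is hypothetical, and it assumes both files have the same number of lines and fields.

```python
def fail_sums_streaming(marks_path, status_path):
    """Yield one total per row, reading both files line by line."""
    with open(marks_path) as marks_file, open(status_path) as status_file:
        # zip() pulls one line from each file per iteration and stops
        # at the end of the shorter file.
        for marks_line, status_line in zip(marks_file, status_file):
            yield sum(
                int(mark)
                for mark, status in zip(marks_line.split(), status_line.split())
                if status == "fail"
            )
```

With the question's files in the current directory, `for total in fail_sums_streaming("file1", "file2"): print(total)` would print one sum per row.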
edited yesterday
answered yesterday
mosvy
One-liner:
    paste file[12] | awk '{T=0; for (i=1; i<=NF/2; i++) T += ($(i+NF/2)=="fail")?$i:0; print T}'
    100
    78
    53
    91
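To make the indexing in the one-liner explicit: `paste` joins each line of file1 to the matching line of file2, so the first NF/2 fields are marks and the last NF/2 are the pass/fail words. Here is a small Python sketch of that split, assuming every joined line holds as many marks as statuses. The function name `sum_failed_from_pasted` is made up for illustration.

```python
def sum_failed_from_pasted(line):
    """Sum the marks in the first half whose partner in the second half is 'fail'."""
    fields = line.split()
    half = len(fields) // 2          # corresponds to NF/2 in the awk one-liner
    marks, statuses = fields[:half], fields[half:]
    return sum(int(m) for m, s in zip(marks, statuses) if s == "fail")


if __name__ == "__main__":
    # First row of file1 and file2 from the question, joined with a tab as paste does.
    pasted = "91 23 56 44 87 77\tpass fail pass pass pass fail"
    print(sum_failed_from_pasted(pasted))  # prints 100
```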
answered 12 hours ago
RudiC