python curve_fitting with bad results
the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.
Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?
Thanks in advance!
import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)
f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)
xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')
print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like
def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m
def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n
is it possible to use the curve_fitting to optimize the parameters?
python optimization scipy curve-fitting
add a comment |
the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.
Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?
Thanks in advance!
import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)
f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)
xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')
print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like
def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m
def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n
is it possible to use the curve_fitting to optimize the parameters?
python optimization scipy curve-fitting
1
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
It looks like the best fit has negative values for some of the parameters, so you should just remove thebounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
– Andrew Jaffe
Nov 13 '18 at 10:23
1
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54
add a comment |
the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.
Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?
Thanks in advance!
import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)
f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)
xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')
print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like
def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m
def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n
is it possible to use the curve_fitting to optimize the parameters?
python optimization scipy curve-fitting
the link of data from dropboxbadfittingI tried use the curve_fit to fit the data with my pre_defined function in python, but the result was far to perfect. The code is simple and shown as below. I have no idea what's wrong.
Since I am new to python, are there any other optimization or fitting methods which are suitable for my case with predefined function?
Thanks in advance!
import numpy as np
import math
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open(r'C:UsersadmDesktopsimpletryfre.txt')
xdata = readdata(f_x)
f_y= open(r'C:UsersadmDesktopsimpletryimpedance.txt')
ydata = readdata(f_y)
xdata = np.array(xdata)
ydata = np.array(ydata)
plt.semilogx(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
plt.semilogx(xdata, func(xdata, *popt), 'r-', label='fitted curve')
print(popt)
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
as you guessed, this is a LCR circuit model. now I am trying to fit two curves with the same parameters like
def func1(x, r1, r2, r3,l,c):
w=2*math.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
return m
def func2(x, r1, r2, r3,l,c):
w=2*math.pi*x
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
return n
is it possible to use the curve_fitting to optimize the parameters?
python optimization scipy curve-fitting
python optimization scipy curve-fitting
edited Nov 14 '18 at 10:57
hao
asked Nov 12 '18 at 18:53
haohao
163
163
1
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
It looks like the best fit has negative values for some of the parameters, so you should just remove thebounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
– Andrew Jaffe
Nov 13 '18 at 10:23
1
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54
add a comment |
1
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
It looks like the best fit has negative values for some of the parameters, so you should just remove thebounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)
– Andrew Jaffe
Nov 13 '18 at 10:23
1
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54
1
1
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
It looks like the best fit has negative values for some of the parameters, so you should just remove the
bounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)– Andrew Jaffe
Nov 13 '18 at 10:23
It looks like the best fit has negative values for some of the parameters, so you should just remove the
bounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)– Andrew Jaffe
Nov 13 '18 at 10:23
1
1
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54
add a comment |
2 Answers
2
active
oldest
votes
Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:
RMSE: 7.415
R-squared: 0.999995
r1 = 1.16614005e+00
r2 = 2.00000664e+05
r3 = 1.54718886e+01
l = 1.94473531e+04
c = 4.32515535e+05
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall
w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)
f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)
xData = numpy.array(xData)
yData = numpy.array(yData)
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')
# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)
# now the model as a line plot
plt.semilogx(xData, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
thanks for your code. now I am trying to fit two curves with the same parameters like**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
add a comment |
To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.
As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1
which adds a constant to the mix.
Most probably you'll end up in something like the following configuration:
popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])
Which will output a neat-looking flat line, due to m = something big + ~0 + ~0
; n=~0 - ~0
, so y = r1
.
However, if you initialize your parameters somewhat differently,
popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
You will get a better looking fit,
popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])
((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78
P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard
where the first row became headers instead of data. Shouldn't change the overall picture though.
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
|
show 4 more comments
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53268378%2fpython-curve-fitting-with-bad-results%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:
RMSE: 7.415
R-squared: 0.999995
r1 = 1.16614005e+00
r2 = 2.00000664e+05
r3 = 1.54718886e+01
l = 1.94473531e+04
c = 4.32515535e+05
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall
w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)
f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)
xData = numpy.array(xData)
yData = numpy.array(yData)
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')
# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)
# now the model as a line plot
plt.semilogx(xData, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
thanks for your code. now I am trying to fit two curves with the same parameters like**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
add a comment |
Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:
RMSE: 7.415
R-squared: 0.999995
r1 = 1.16614005e+00
r2 = 2.00000664e+05
r3 = 1.54718886e+01
l = 1.94473531e+04
c = 4.32515535e+05
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall
w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)
f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)
xData = numpy.array(xData)
yData = numpy.array(yData)
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')
# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)
# now the model as a line plot
plt.semilogx(xData, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
thanks for your code. now I am trying to fit two curves with the same parameters like**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
add a comment |
Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:
RMSE: 7.415
R-squared: 0.999995
r1 = 1.16614005e+00
r2 = 2.00000664e+05
r3 = 1.54718886e+01
l = 1.94473531e+04
c = 4.32515535e+05
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall
w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)
f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)
xData = numpy.array(xData)
yData = numpy.array(yData)
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')
# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)
# now the model as a line plot
plt.semilogx(xData, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
Here are my results using scipy's differential_evolution genetic algorithm module to generate the initial parameter estimates for curve_fit, along with a simple "brick wall" in the function to ensure all parameters are positive. Scipy's implementation of Differential Evolution uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search - in this example, those bounds are taken from the data maximum and minimum values. My results:
RMSE: 7.415
R-squared: 0.999995
r1 = 1.16614005e+00
r2 = 2.00000664e+05
r3 = 1.54718886e+01
l = 1.94473531e+04
c = 4.32515535e+05
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
def func(x, r1, r2, r3,l,c):
# "brick wall" ensuring all parameters are positive
if r1 < 0.0 or r2 < 0.0 or r3 < 0.0 or l < 0.0 or c < 0.0:
return 1.0E10 # large value gives large error, curve_fit hits a brick wall
w=2*numpy.pi*x
m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2)
n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2)
y= (m**2+n**2)**.5
return y
def readdata(filename):
x = filename.readlines()
x = list(map(lambda s: s.strip(), x))
x = list(map(float, x))
return x
# test data
f_x= open('/home/zunzun/temp/data/fre.txt')
xData = readdata(f_x)
f_y= open('/home/zunzun/temp/data/impedance.txt')
yData = readdata(f_y)
xData = numpy.array(xData)
yData = numpy.array(yData)
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
minBound = min(minX, minY)
maxBound = max(maxX, maxY)
parameterBounds =
parameterBounds.append([minBound, maxBound]) # search bounds for r1
parameterBounds.append([minBound, maxBound]) # search bounds for r2
parameterBounds.append([minBound, maxBound]) # search bounds for r3
parameterBounds.append([minBound, maxBound]) # search bounds for l
parameterBounds.append([minBound, maxBound]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
plt.semilogx(xData, yData, 'D')
# create data for the fitted equation plot
yModel = func(xData, *fittedParameters)
# now the model as a line plot
plt.semilogx(xData, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
edited Nov 13 '18 at 12:33
answered Nov 13 '18 at 12:26
James PhillipsJames Phillips
1,494387
1,494387
thanks for your code. now I am trying to fit two curves with the same parameters like**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
add a comment |
thanks for your code. now I am trying to fit two curves with the same parameters like**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?
– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
thanks for your code. now I am trying to fit two curves with the same parameters like
**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?– hao
Nov 14 '18 at 10:59
thanks for your code. now I am trying to fit two curves with the same parameters like
**def func1(x, r1, r2, r3,l,c):** w=2*math.pi*x m=r1+(r2*l*w)/(r2**2+l**2*w**2)+r3/(1+r3*c**2*w**2) return m **def func2(x, r1, r2, r3,l,c):** w=2*math.pi*x n=(r2**2*l*w)/(r2**2+l**2*w**2)-r3**3*c*w/(1+r3*c**2*w**2) return n
is it possible to use the curve_fitting to optimize the parameters?– hao
Nov 14 '18 at 10:59
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
Looks to me as if you can simply replace the fitted function in my example with the code in your comment. Note that I used "numpy.pi" as I did not "import math" in my example code.
– James Phillips
Nov 14 '18 at 16:27
add a comment |
To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.
As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1
which adds a constant to the mix.
Most probably you'll end up in something like the following configuration:
popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])
Which will output a neat-looking flat line, due to m = something big + ~0 + ~0
; n=~0 - ~0
, so y = r1
.
However, if you initialize your parameters somewhat differently,
popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
You will get a better looking fit,
popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])
((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78
P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard
where the first row became headers instead of data. Shouldn't change the overall picture though.
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
|
show 4 more comments
To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.
As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1
which adds a constant to the mix.
Most probably you'll end up in something like the following configuration:
popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])
Which will output a neat-looking flat line, due to m = something big + ~0 + ~0
; n=~0 - ~0
, so y = r1
.
However, if you initialize your parameters somewhat differently,
popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
You will get a better looking fit,
popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])
((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78
P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard
where the first row became headers instead of data. Shouldn't change the overall picture though.
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
|
show 4 more comments
To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.
As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1
which adds a constant to the mix.
Most probably you'll end up in something like the following configuration:
popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])
Which will output a neat-looking flat line, due to m = something big + ~0 + ~0
; n=~0 - ~0
, so y = r1
.
However, if you initialize your parameters somewhat differently,
popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
You will get a better looking fit,
popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])
((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78
P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard
where the first row became headers instead of data. Shouldn't change the overall picture though.
To have your Least Squares regression make sense, you'll have to at least supply initial parameters which make sense.
As all parameters are by default initiated to the value 1, the biggest influence on the initial regression will be resistor r1
which adds a constant to the mix.
Most probably you'll end up in something like the following configuration:
popt
Out[241]:
array([1.66581563e+03, 2.43663552e+02, 1.13019744e+00, 1.20233767e+00,
5.04984535e-04])
Which will output a neat-looking flat line, due to m = something big + ~0 + ~0
; n=~0 - ~0
, so y = r1
.
However, if you initialize your parameters somewhat differently,
popt, pcov = curve_fit(func, xdata.flatten(), ydata.flatten(), p0=[0.1,1e5,1000,1000,0.2],
bounds=((0, 0, 0, 0, 0), (np.inf, np.inf, np.inf, np.inf, np.inf)))
You will get a better looking fit,
popt
Out[244]:
array([1.14947146e+00, 4.12512324e+05, 1.36182466e+02, 8.29771756e+04,
1.77593448e+03])
((fitted-ydata.flatten())**2).mean()
Out[257]: 0.6099524982664816
#RMSE hence 0.78
P.s. My data starts at the second data point, due to a conversion error with pd.read_clipboard
where the first row became headers instead of data. Shouldn't change the overall picture though.
edited Nov 13 '18 at 12:36
answered Nov 13 '18 at 12:29
UvarUvar
2,519520
2,519520
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
|
show 4 more comments
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
How did you determine these initial parameter estimates?
– James Phillips
Nov 13 '18 at 12:32
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
By figuring out it was a RLC circuit-like; roughly figuring C to be smaller than 1, R to be on the order of thousands; from the final part of the curve noting that R1 should be small. Manual labour in this case, I'm afraid @JamesPhillips . And then being wrong about my assumption on C, according to the fit.. :(
– Uvar
Nov 13 '18 at 12:34
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I will spend a week writing code to avoid 10 minutes of manual labor...
– James Phillips
Nov 13 '18 at 12:35
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
I am afraid I would not like to spend my time on writing code which effectively brute-forces an optimal parameter initialization, whereas just picking some numbers; preferably non-uniform; will do the trick. Maybe if I need to reproduce similar thousands of times, but not for a single instance. ;)
– Uvar
Nov 13 '18 at 12:37
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
The genetic algorithm in scipy allows a more general solution with wide applicability to many problems - hence my answer.
– James Phillips
Nov 13 '18 at 12:40
|
show 4 more comments
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53268378%2fpython-curve-fitting-with-bad-results%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Would you please post a link to the data files?
– James Phillips
Nov 12 '18 at 19:56
dropbox.com/s/1cciw4h7yrmc1qx/data.rar?dl=0
– hao
Nov 13 '18 at 9:55
Do you have any idea what the approximate right answer is? You can use that to specify a starting point and bounds that might help the curve-fit routine.
– Andrew Jaffe
Nov 13 '18 at 10:02
It looks like the best fit has negative values for some of the parameters, so you should just remove the
bounds=...
part of the call. (Also, I suspect that there may be some degeneracy in your fit -- i.e., more than one set of parameters fits the data just as well.)– Andrew Jaffe
Nov 13 '18 at 10:23
1
yeah, that's the problem. I had tried to remove the bounds and it can fit well. but the fact is that ALL the parameters need to be positive.
– hao
Nov 13 '18 at 10:54