If you eliminate i(t) as done below your problem only depends on 3 parameters instead of 5.

That ought to help no matter what the method is.

restart;
Digits:=15:
eq1:=J*diff(theta(t),t,t)+b*diff(theta(t),t)=K*i(t);
eq2:=L*diff(i(t),t)+R*i(t)=V-K*diff(theta(t),t);
ICs:= theta(0)=0, D(theta)(0) = 0, i(0) = 0;
####
M:=ImportMatrix("F:/MapleDiverse/servo_open_ident_1_A.csv",source=csv):
M[1..10,..]; #Jump in V (col. 2) at start
M[-10..-1,..];
plot(M[..,1..2],labels=[t,V]);
#### For V I shall use this:
V1:=Statistics:-NonlinearFit(a+b*sin(c*t+d),M[..,1..2],t,initialvalues={a=0,b=0.5,c=1/50,d=0});
plot(V1,t=0..50); p1:=%:
plot(M[..,1..2]); p2:=%:
plots:-display(p1,p2);
w1:=unapply(V1,t)~(M[..,1]):
w:=M[..,2]:
dw:=w-w1:
eval(V1,t=0);
dw[1];
(max,min)(dw[2..]); # Not bad
####
plot(M[..,[1,3]],size=[1000,default]); pd:=%: # For later use kept in pd.
plot(M[..,[1,4]],size=[1000,default]); # Isn't used in the following
params:={J= 6.9347e-005, b = 2.2480e-004, K= 0.0094, R=0.2124, L=0.0015}; # Supplied by OP
####From eq1 and ICs follow that (D@@2)(theta)(0)=0. Will be used below.
####Eliminating i(t) by using integrating factor exp(R/L*t) used to solve first order linear odes:
S:=Diff(exp(R/L*t)*i(t),t)=exp(R/L*t)*rhs(eq2)/L;
##Operating on eq1:
map(Diff,exp(R/L*t)*convert(eq1,Diff)/K,t);
##Elimination of i(t):
subs(S,%);
ode:=value(%);
ode1:=L*J*op(solve(ode,{diff(theta(t),t,t,t)}));
ode2:=collect(ode1,diff,factor);
##We see that this third order ode depends on the following groups only:
TR:={c1=(J*R+L*b)/(L*J), c2=(K^2+R*b)/(L*J), c3= K/(L*J)};
cparams:=eval(TR,params); #Op's params translated to c1,c2,c3
sbs:=solve(TR,{J,K,b});
ode3:=subs(sbs,ode2);
solve(ode3,{diff(theta(t),t,t,t)});
ODE:=eval(op(%),V=V1); # The 3rd order ode for theta(t).
##The initial conditions:
ics:=theta(0) = 0,D(theta)(0) = 0,(D@@2)(theta)(0)=0;
res:=dsolve({ODE,ics},numeric,parameters=[c1,c2,c3],output=listprocedure,maxfun=10^6);
th:=subs(res,theta(t));
## Q sets the parameters and computes the sum of squares of the residuals:
Q:=proc(c1,c2,c3) option remember; local j,ssq;
if not [c1,c2,c3]::list(realcons) then return 'procname(_passed)' end if;
try
res(parameters=[c1,c2,c3]);
ssq:=add( (th(M[j,1])-M[j,3])^2,j=1..5001)
catch:
ssq:=10^15
end try;
ssq
end proc:
##
t0:=time[real]():
RES:=Optimization:-NLPSolve(Q(c1,c2,c3),initialpoint=cparams);
time[real]()-t0; # 640s in my case.
##Finished with the error: No improved point could be found.
##But let us look at the remember table for Q:
T:=op(4,eval(Q)):
LL:=[entries(T,pairs)]:
LL:=remove(hastype,LL,{function,HFloat(undefined)}):
LS:=sort(LL,key=rhs):
LS[1];
LS[-1]; # A negative value for c1
res(parameters=([c1,c2,c3]=~[lhs(LS[1])]));
plots:-odeplot(res,[t,theta(t)],0..50,color=blue); pm:=%:
plots:-display(pd,pm,legend=[Data,Model]); #Not bad

The final plot of theta:

The c-values found were: [c1 = 0.07085248546343231, c2 = 1072.44657764141, c3 = 91326.7791941768]

(the result of [c1,c2,c3]=~[lhs(LS[1])]; )