OpenSeesSP freezes
Dear all,
I have a problem running scripts with OpenSeesSP.
I tried the "OpenSees Example 5. 2D Frame, 3-story 3-bay" example.
link : http://opensees.berkeley.edu/wiki/index ... _W-Section
Here are the modifications I made; they were necessary to run this script with OpenSeesSP:
1. Changed "system BandGeneral" to "system Mumps" in "Ex5.Frame2D.build.InelasticFiberWSection.tcl".
2. Also removed the system declaration in "LibAnalysisStaticParameters.tcl" (it seems a duplicate declaration of "system Mumps" causes the freeze).
3. Removed recorders that are incompatible with OpenSeesSP (e.g., the drift recorder).
4. Changed the 6-dof lateral loads to 3-dof in "Ex5.Frame2D.analyze.Static.Push.tcl".
5. Changed "source GeneratePeaks.tcl" to "source LibGeneratePeaks.tcl" in "Ex5.Frame2D.analyze.Static.Cycle.tcl".
Execution command: mpiexec -n 4 OpenSeesSP runCyclic.tcl
Contents of "runCyclic.tcl":
source Ex5.Frame2D.build.InelasticFiberWSection.tcl
source Ex5.Frame2D.analyze.Static.Cycle.tcl
These are the errors I get when I use Mumps:
WARNING MumpsParallelSolver::setSize(void)- Error -3 returned in substitution dmumps()
WARNING:MumpsParallelSOE::setSize : solver failed setSize()
StaticAnalysis::handle() - LinearSOE::setSize() failedStaticAnalysis::analyze() - domainChanged failed at step 0 of 1
OpenSees > analyze failed, returned: -1 error flag
Trying Newton with Initial Tangent ..
Although four OpenSeesSP processes were running, the analysis would not progress; after several minutes, MPIEXEC timed out.
And these are part of the error messages when I use SparseGEN instead of Mumps:
DistributedSuperLU::DistributedSuperLU()
DistributedSuperLU::recvSelf(int cTag, Channel &theChannel) - START
DistributedSuperLU::recvSelf(int cTag, Channel &theChannel) - END
DistributedSuperLU::sendSelf(int cTag, Channel &theChannel) - 5
DistributedSuperLU::sendSelf(int cTag, Channel &theChannel) - 5
Fatal error in PMPI_Comm_create: Other MPI error, error stack:
PMPI_Comm_create(609).........: MPI_Comm_create(comm=0xc40300f8, group=0xc80100f8, new_comm=0202E3CC) failed
PMPI_Comm_create(590).........:
MPIR_Comm_create_intra(250)...:
MPIR_Get_contextid(521).......:
MPIR_Get_contextid_sparse(752): Too many communicators
My system: Windows 10 Pro with an Intel Skylake CPU, running OpenSeesSP 2.5.0.
Any help would be appreciated.
Regards.
Re: OpenSeesSP freezes
The problem is resolved.
It looks like the "system Mumps" command and the "analysis Static" command should not be declared more than once.
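A minimal sketch of the layout that worked, assuming the standard OpenSees analysis commands (the constraints/numberer/test/integrator choices below are illustrative placeholders, not taken from the actual example files): every analysis object, including "system Mumps" and "analysis Static", is declared exactly once before the first analyze call.

```tcl
# Hypothetical runCyclic.tcl layout: each analysis object appears exactly once.
source Ex5.Frame2D.build.InelasticFiberWSection.tcl

constraints Transformation     ;# illustrative choices -- adapt to the example
numberer Plain
system Mumps                   ;# declared once; a second declaration froze the run
test NormDispIncr 1.0e-8 10
algorithm Newton
integrator LoadControl 0.1
analysis Static                ;# likewise declared once

analyze 10
```

This fragment needs the OpenSeesSP interpreter to run, so it is shown only as a layout sketch.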
After that problem was resolved, another one appeared: I have no idea why, but the "system Mumps" command increases analysis time.
Below are cyclic-analysis comparisons; [example 5] is modified to repeat 10 times per cycle.
OpenSees (1 node) + "system BandGeneral" : 9252 milliseconds
OpenSeesSP (4 nodes) + "system BandGeneral" : 5185 milliseconds
OpenSeesSP (4 nodes) + "system Mumps" : 13072 milliseconds
Question 1: why does "system Mumps" slow down OpenSeesSP?
Question 2: how can "system BandGeneral" solve equations in parallel? This is not documented in the manual.
Any help would be appreciated.
Re: OpenSeesSP freezes
1. For small problems, or problems in which the matrix has a narrow band with a lot of fill-in (the case of a 2D tall building), there is the potential for Mumps to be slower; however, this is the first case I have seen where Mumps is slower than BandGeneral. Run OpenSeesSP with 1 processor and Mumps.
2. The solving is being done sequentially on processor P0. The big difference between 1 and 4 is due either to slow element state determination or to page faults on P0 when only using 1 processor.
Re: OpenSeesSP freezes
Dear fmk,
thank you very much for your reply!
I re-ran the analysis to see the differences, as you suggested.
OpenSees (1 node) + "system BandGeneral" : 8857 milliseconds
OpenSees (1 node) + "system Mumps" : 12644 milliseconds
OpenSeesSP (4 nodes) + "system BandGeneral" : 4968 milliseconds
OpenSeesSP (4 nodes) + "system Mumps" : 12338 milliseconds
If the slow Mumps analysis is due to the stiffness matrix, I will try other examples.
I would also like to ask you some questions.
1. What are "page faults"?
2. When using OpenSeesMP, what is the best way to assign elements to cores? (substructuring?)
Assume a building with 8 floors and 4 spans.
Method 1 - assign elements horizontally:
floors 1 to 4 : assign to core 0
floors 5 to 8 : assign to core 1
Method 2 - assign elements vertically:
column spans 1 to 2 : assign to core 0
column spans 3 to 4 : assign to core 1
3. Is there any way for mpiexec.exe to stay in memory?
Every time I type "mpiexec -n 4 ~~~" at the command prompt (or from MATLAB), it launches mpiexec and exits when OpenSees terminates.
This launch-and-exit process takes approximately 5 to 10 seconds, which wastes a lot of time when performing repeated batch analyses.
Re: OpenSeesSP freezes
1. Page faults: a running program has access to an awful lot of memory, way more than the RAM on the computer. The application's memory is termed virtual memory, as opposed to the physical memory. Magic stuff in the hardware and OS maps the virtual address space to the physical memory. To keep things easy for the computer, the mapping is done in chunks of memory, or pages. For an application that uses a small amount of memory, all the pages of the running application are kept in physical memory. For larger memory hogs this is not possible, so the data in certain pages must be stored on the hard drive. When the application needs one of these stored pages, the application comes to a grinding halt and the operating system is called upon to send another page to disk and then read the old page from disk. Disk access is SLOOOOOWWWWWWWW compared to RAM access.
2. As you describe. If more nonlinearity is expected in certain parts of the building, assign fewer elements to those processors and more elsewhere.
3. Not sure if I understand what you are asking, but I never use the cmd to launch mpi; learn to launch from the terminal application/DOS prompt. If you use the terminal application, set the location of mpiexec and OpenSees in the PATH environment variable.
If you have batch runs to do, it might be more efficient to run multiple sequential jobs; e.g., if you can run 16 jobs at once (assuming memory is not your problem), you may obtain a speedup of about 16 versus what you are seeing now. What slows this perfectly scalable batch-processing approach down is disk writes (if all 16 jobs write to a common disk, things can be painful if you are writing lots of data).
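The batch idea above can be sketched in a shell script. The OpenSees invocation is a placeholder; a trivial command stands in for it here so the launch-in-background-and-wait pattern itself is runnable.

```shell
#!/bin/sh
# Launch several independent sequential jobs concurrently, then wait for all.
# In practice each job would be something like:
#   OpenSees "run_$i.tcl" > "log_$i.txt" &
for i in 1 2 3 4; do
  sh -c "echo job $i finished" > "log_$i.txt" &   # placeholder for OpenSees
done
wait    # block until every background job completes
cat log_1.txt log_2.txt log_3.txt log_4.txt
```

Each job writes its own log file, which also sidesteps the shared-disk contention mentioned above as long as the logs stay small.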
Re: OpenSeesSP freezes
Dear fmk,
Thank you very much for your detailed reply.
I'm sorry, but I can't understand answer 3.
I'm ashamed of my ignorance, but I don't see the difference between the command prompt (cmd.exe) and the terminal application/DOS prompt.
(mpiexec and OpenSees are included in the PATH environment variable.)
Your suggestion about multiple sequential jobs is quite an amazing idea; I should consider modifying my code to run multiple OpenSees instances.
(Currently, the analyses must run one after another, because each analysis is based on the previous one.)
Re: OpenSeesSP freezes
Hi
I am running a model in which I first do a static and then a transient analysis. I use OpenSeesSP with the Mumps solver. I declared "system Mumps" only once, for the static analysis, but I get the same error you got the first time. I get the error when np > 1, while it works well for np = 1. How did you solve the problem?
Re: OpenSeesSP freezes
I take it Mumps works well for the static analysis and then fails in the transient.
Re: OpenSeesSP freezes
OK. Thanks.
Re: OpenSeesSP freezes
Thank you for your help.
Re: OpenSeesSP freezes
Hi,
I am also running a transient analysis after the static analysis using OpenSeesSP, and it gives me the same error. I have defined "system Mumps" only once for the analysis, and it also works fine when np = 1. Is there a way to solve this problem now? Thanks in advance!
Regards,
Junfei