Kolab2 Slapd hanging - master/slave replication issues

Tue Mar 15 10:00:48 CET 2005

Hi,

On Tuesday 15 March 2005 10:08, Dieter Kluenter wrote:
> > We have a big Kolab2 server with about 350 users on it and 2 slave servers
> > in remote geographic locations. The slave servers have between 50 and 100
> > users each.
> 
> These are not large numbers in directory terms :-)
I know :-) But as Kolab installations go, it's not too bad for one P4 with an
IDE HDD and 1 GB of memory. You have to remember that Kolab makes a _lot_ of
LDAP requests right through the message pipeline.

> Please give some more details
> - OpenLDAP versions of master and slaves
openldap-2.2.17-2.2.0 (OpenPKG)

> - Contents of DB_CONFIG
Does not exist. By default this is not configured on Kolab (1 and 2). I think
this is a pretty big oversight.

> - cachesize in slapd.conf
Not configured, so the defaults apply.

> - idlcachesize in slapd.conf
Not configured, so the defaults apply.

> - indices in slapd.conf
index   objectClass     eq
index   uid             eq
index   mail            eq
index   alias           eq

> - database definition in slapd.conf
database        bdb
checkpoint      128 10
directory       /kolab/var/openldap/openldap-data
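
If I were to set the two cache directives you asked about, I assume they would
go into this same database section, roughly as sketched below. The numbers are
placeholders I picked for illustration; I have not tested them on this box.

```
database        bdb
directory       /kolab/var/openldap/openldap-data
checkpoint      128 10
# Hypothetical values, not measured: entries held in the in-memory entry cache
cachesize       5000
# Hypothetical value: number of index (IDL) slots kept in memory
idlcachesize    15000
```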

> - BerkeleyDB version and number of patches applied
db-4.2.52.2-2.2.0 (OpenPKG)

> > 2) When slapd hangs we need to do a db_recover to get it back up and running.
> > After having to run db_recover on the slave as well as the master servers the
> > databases have now become inconsistent.
> 
> A database corruption occurs only with heavy write load and
> insufficient cache size or a flag DB_TXN_NOSYNC set in DB_CONFIG.
As I suspected. The box consistently has about 200 MB of free memory, and I'm
thinking of making the cache nice and big.
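
Something like the following DB_CONFIG in the database directory is what I have
in mind. This is only a sketch: the sizes are my own guesses based on the free
memory, and as far as I understand a new cache size only takes effect once the
BerkeleyDB environment has been recreated (for example by running db_recover
with slapd stopped).

```
# /kolab/var/openldap/openldap-data/DB_CONFIG (assumed values, untested)
# set_cachesize <gbytes> <bytes> <ncache>: 128 MB in a single segment
set_cachesize   0 134217728 1
# 2 MB in-memory transaction log buffer
set_lg_bsize    2097152
# DB_TXN_NOSYNC is deliberately not set, given your remark about corruption
```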

> > So my questions:
> >
> > 1) What can I do to make slapd more robust? Pre-forking, more
> >    children processes?
> 
> That depends on your OS and hardware as well as on the number of
> connections in a given period.
Is there a good way to measure the number of connections per second with OpenLDAP?

> > 2) How easy is it to re-sync the master-slave databases? Can I stop the servers,
> > copy the master dbs to the slaves and start up again?
> 
> You may slapcat the master and slapadd on the slave.
Copying the database across is the way the OpenLDAP Admin Guide recommends.
I have no issue with using slapcat and slapadd, though. Can this be done at any stage?

> > Any tuning/optimisation tips would be greatly appreciated.
> 
> I would like to, but a bit more information would be helpful :-)
Well, I hope these details help.

Thanks for the interest!

Regards,
-- 
Stephan  Buys
Code Fusion cc.
Tel: +27 11 673 0411
Mobile: +27 83 294 1876
Email: s.buys at codefusion.co.za

E-mail Solutions, Kolab Specialists.
http://www.codefusion.co.za