
Terminal Servers randomly breaking, 30000ms service timeouts: UmRdpService, UxSms, dot3svc, Netman, Spooler

At different times over the weekend, 3 of our production Terminal Servers failed and began showing what appear to be the same symptoms. The System log shows each server continually cycling through service timeouts. On some servers the same services time out in a fixed order; on another there is no obvious pattern and different services are involved. In every case it is the same 30000 ms timeout, logged every 30 seconds, continually.

The messages below are taken from TS03 in the order they occur. TS03 loops through this order continually, but the other servers do not follow the same order.

Log Name:      System
Source:        Service Control Manager
Date:          29/01/2013 4:28:09 PM
Event ID:      7011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TS03.CHFTE.local
Description:
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the UmRdpService service.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908D1-A6D7-4695-8E1E-26931D2012F4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7011</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2013-01-29T05:28:09.000Z" />
    <EventRecordID>339266</EventRecordID>
    <Correlation />
    <Execution ProcessID="0" ThreadID="0" />
    <Channel>System</Channel>
    <Computer>TS03.CHFTE.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">30000</Data>
    <Data Name="param2">UmRdpService</Data>
  </EventData>
</Event>

Log Name:      System
Source:        Service Control Manager
Date:          29/01/2013 4:28:39 PM
Event ID:      7011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TS03.CHFTE.local
Description:
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the UxSms service.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908D1-A6D7-4695-8E1E-26931D2012F4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7011</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2013-01-29T05:28:39.000Z" />
    <EventRecordID>339267</EventRecordID>
    <Correlation />
    <Execution ProcessID="0" ThreadID="0" />
    <Channel>System</Channel>
    <Computer>TS03.CHFTE.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">30000</Data>
    <Data Name="param2">UxSms</Data>
  </EventData>
</Event>

Log Name:      System
Source:        Service Control Manager
Date:          29/01/2013 4:44:31 PM
Event ID:      7011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TS03.CHFTE.local
Description:
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the dot3svc service.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908D1-A6D7-4695-8E1E-26931D2012F4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7011</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2013-01-29T05:44:31.000Z" />
    <EventRecordID>339269</EventRecordID>
    <Correlation />
    <Execution ProcessID="0" ThreadID="0" />
    <Channel>System</Channel>
    <Computer>TS03.CHFTE.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">30000</Data>
    <Data Name="param2">dot3svc</Data>
  </EventData>
</Event>

Log Name:      System
Source:        Service Control Manager
Date:          29/01/2013 4:45:01 PM
Event ID:      7011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      TS03.CHFTE.local
Description:
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the Netman service.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908D1-A6D7-4695-8E1E-26931D2012F4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7011</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2013-01-29T05:45:01.000Z" />
    <EventRecordID>339270</EventRecordID>
    <Correlation />
    <Execution ProcessID="0" ThreadID="0" />
    <Channel>System</Channel>
    <Computer>TS03.CHFTE.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">30000</Data>
    <Data Name="param2">Netman</Data>
  </EventData>
</Event>
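To check whether the timeouts really follow a fixed order and a 30-second rhythm, the events can be exported as Event XML and tallied. The following is a minimal Python sketch, assuming each record is available as a separate XML string in the format shown above. The names `parse_timeout_event`, `summarize` and `SAMPLE` are illustrative and not part of any Windows tool. `SystemTime` is UTC; in the entries above the displayed local time is 11 hours ahead of it.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Namespace used by the Event XML shown in the log entries above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def parse_timeout_event(xml_text):
    """Return (utc_time, timeout_ms, service) for an Event ID 7011 record, else None."""
    root = ET.fromstring(xml_text)
    if root.findtext("e:System/e:EventID", namespaces=NS) != "7011":
        return None
    stamp = root.find("e:System/e:TimeCreated", NS).get("SystemTime")
    # Drop the fractional seconds so any number of fraction digits parses.
    when = datetime.strptime(stamp.split(".")[0], "%Y-%m-%dT%H:%M:%S")
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    return when, int(data["param1"]), data["param2"]


def summarize(xml_records):
    """Count timeouts per service and list the gaps (seconds) between events."""
    parsed = sorted(p for p in map(parse_timeout_event, xml_records) if p)
    per_service = Counter(service for _, _, service in parsed)
    gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(parsed, parsed[1:])]
    return per_service, gaps


# Cut-down stand-in for an exported record (hypothetical, trimmed for brevity).
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID Qualifiers="49152">7011</EventID>
    <TimeCreated SystemTime="{time}" />
  </System>
  <EventData>
    <Data Name="param1">30000</Data>
    <Data Name="param2">{service}</Data>
  </EventData>
</Event>"""

records = [
    SAMPLE.format(time="2013-01-29T05:28:09.000Z", service="UmRdpService"),
    SAMPLE.format(time="2013-01-29T05:28:39.000Z", service="UxSms"),
]
per_service, gaps = summarize(records)
print(per_service)
print(gaps)
```

Running the same summary per server would show whether the service order and the gaps actually match between TS03, TS06 and TS07, rather than relying on reading the log by eye.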


The only known resolution is a reboot, and eventually the problem recurs.

At first we suspected one of three HP printers in use with unknown drivers: installing their driver and services has caused very similar issues before. This now seems unlikely, as one of the affected Terminal Servers is External only while the other 2 are Internal only. Even if a user were redirecting a printer with a bad driver (which shouldn't install a driver anyway), I can't see how they would hit all 3 Terminal Servers. Regardless, rebooting the servers, deleting the HP drivers, and disabling users' ability to install printer drivers (via GPO) did not resolve it. Disabling the HP services, and even deleting them from the registry (as they set themselves back to Automatic), made no difference.

The only change in our environment is the introduction of the Fuji Xerox GPD to replace model-specific drivers, over 2 weeks ago now. The drivers were pre-installed without major issue and have been in use since. Otherwise I can't find anything. The times each server appears to have first been affected are below:

TS03 (External) - 28/01 11:20AM - Began with the Spooler service logging the 30000 ms timeout continuously, with no other services logging the same. Later that day it spread to the other services mentioned above, until the server was rebooted on 29/01.
TS06 (Internal) - 29/01 1:52AM - Began cycling in the order UmRdpService, UxSms, Netman a few times, then Spooler joined at the end of that order.
TS07 (Internal) - 25/01 5:42:56PM - Exactly the same behavior as TS06.

On each server, around the time it was affected, there is this entry:

Log Name:      Application
Source:        Desktop Window Manager
Date:          25/01/2013 5:42:26 PM
Event ID:      9009
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      TS07.CHFTE.local
Description:
The Desktop Window Manager has exited with code (0x40010004)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Desktop Window Manager" />
    <EventID Qualifiers="16384">9009</EventID>
    <Level>4</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2013-01-25T06:42:26.000Z" />
    <EventRecordID>374539</EventRecordID>
    <Channel>Application</Channel>
    <Computer>TS07.CHFTE.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data>0x40010004</Data>
  </EventData>
</Event>

But filtering the log shows this entry hundreds of times, including many occurrences well before the problem first appeared, so it is most likely just coincidence.
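One way to test the coincidence theory is to compare how often event 9009 falls shortly before an onset against how often it occurs in general. This is a rough Python sketch, assuming the event times have already been extracted as local timestamps. `preceding_events` and `background_rate` are hypothetical helper names, and the 5-minute window is an arbitrary choice.

```python
from datetime import datetime, timedelta


def preceding_events(onset, event_times, window=timedelta(minutes=5)):
    """Return the event times that fall within `window` before `onset`."""
    return [t for t in event_times if timedelta(0) <= onset - t <= window]


def background_rate(event_times):
    """Average events per hour across the span of the log, as a baseline."""
    if len(event_times) < 2:
        return 0.0
    span_hours = (max(event_times) - min(event_times)).total_seconds() / 3600
    return len(event_times) / span_hours if span_hours else 0.0


# TS07 figures from this post: first timeout and the nearby 9009 entry.
ts07_onset = datetime(2013, 1, 25, 17, 42, 56)
ts07_dwm_exits = [datetime(2013, 1, 25, 17, 42, 26)]

hits = preceding_events(ts07_onset, ts07_dwm_exits)
lead_seconds = (ts07_onset - hits[0]).total_seconds()
print(lead_seconds)
```

On TS07 the 9009 entry precedes the first timeout by 30 seconds, the same length as the timeout itself. If the full list of 9009 times gives a high background rate, a hit in the window means little; if the rate is low and every onset has a hit, the entry deserves a closer look.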

There is also no common account in the Security log that logged on around the times the issues occurred.

This all appeared to be fixed after the HP driver/service removal and reboots, but later in the day TS06 was affected again, followed by TS03 tonight (now approx. 11:10PM). So the timing of when it affects the servers appears to be random.

How do I find out more about what is actually causing this? Is there another log or something I'm missing? I'm beyond frustrated with this.

