I have a Windows Server 2022 machine with two 2 TB Samsung 990 Pro SSDs. I've had some weird problems with one of them disappearing from time to time. Every 2 months or so, the disk in question simply ceases to exist: neither diskpart nor Get-PhysicalDisk (in PowerShell) lists it anymore. The only fix at that point is a complete power-down and restart; a simple restart from within the OS is not sufficient.
At first I thought it was an issue with the motherboard, so I got in touch with the manufacturer and -surprise!- they told me to make sure it wasn't a problem with the disk. After some back and forth, I decided to explore a potential issue with the disks first, simply to avoid the hassle of replacing the mobo and then still having the problem.
Examining the disks was not so easy, because this is a Server Core installation, so there is no GUI. I was still able to do some analysis, which revealed a shocker: running Microsoft's diskspd showed abysmal performance for both disks. Both read and write throughput are just below 50 MiB/s, which is far lower than the specs of the 990 Pro.
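As a rough cross-check of the diskspd numbers, something like the following could time a sequential write from Python. This is only a minimal sketch: the function name, the 64 MiB file size, the 1 MiB block size, and the temporary file location are all assumptions for illustration, and it is not a substitute for a proper diskspd run.

```python
import os
import tempfile
import time


def sequential_write_mib_per_s(path, total_mib=64, block_mib=1):
    """Write total_mib MiB to path in block_mib chunks, fsync, return MiB/s."""
    # Random data, so a compressing layer cannot inflate the result.
    block = os.urandom(block_mib * 1024 * 1024)
    blocks = total_mib // block_mib
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        for _ in range(blocks):
            f.write(block)
        # Ask the OS to flush its cache so the timing includes the device.
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (blocks * block_mib) / elapsed


if __name__ == "__main__":
    # Hypothetical target: pass dir=... to place the file on the disk under test.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "throughput_test.bin")
        print(f"{sequential_write_mib_per_s(target):.1f} MiB/s")
```

A small file like this mostly measures burst behaviour, so the result should be read as a sanity check only.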
So I now have several questions:
Are the two problems (a disk disappearing from time to time and the poor performance) linked?
Could the speed problem be caused by the motherboard (it is an ASRock X570S PG Riptide)?
Could it be that the SSDs are counterfeit? And how can I check this?
Any suggestions on further analyzing this?
Clarification:
Server logs: nothing shows up in event viewer
Age of the drives: they're a year old and haven't been used intensively
SMART readings:
This is the output I got from Samsung DC Toolkit:
Disk Number: 1:c | Model Name: Samsung SSD 990 PRO with Heatsink 2TB | Firmware Version: 0B2QJXG7
| Bytes | Description | Value |
|---|---|---|
| 0 | Critical Warning | 0x00 |
| 2:1 | Composite Temperature | 0x0142 |
| 3 | Available Spare | 0x64 |
| 4 | Available Spare Threshold | 0x0A |
| 5 | Percentage Used | 0x02 |
| 47:32 | Data Units Read | 0x000000000000000000000000011BD521 |
| 63:48 | Data Units Written | 0x000000000000000000000000010D94FB |
| 79:64 | Host Read Commands | 0x0000000000000000000000000DD8604F |
| 95:80 | Host Write Commands | 0x0000000000000000000000001282EACA |
| 111:96 | Controller Busy Time | 0x00000000000000000000000000009963 |
| 127:112 | Power Cycle | 0x00000000000000000000000000000020 |
| 143:128 | Power On Hours | 0x00000000000000000000000000001F93 |
| 159:144 | Unsafe Shutdowns | 0x00000000000000000000000000000014 |
| 175:160 | Media and Data Integrity Errors | 0x00000000000000000000000000000000 |
| 191:176 | Number of Error Information Log Entries | 0x00000000000000000000000000000000 |
| 195:192 | Warning Composite Temperature Time | 0x00040880 |
| 199:196 | Critical Composite Temperature Time | 0x00000000 |
| 201:200 | Temperature Sensor 1 | 0x0142 |
| 203:202 | Temperature Sensor 2 | 0x0149 |
| 205:204 | Temperature Sensor 3 | 0x0000 |
| 207:206 | Temperature Sensor 4 | 0x0000 |
| 209:208 | Temperature Sensor 5 | 0x0000 |
| 211:210 | Temperature Sensor 6 | 0x0000 |
| 213:212 | Temperature Sensor 7 | 0x0000 |
| 215:214 | Temperature Sensor 8 | 0x0000 |
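The raw hex values in the SMART log above are hard to read, so here is a minimal Python sketch for converting a few of them into human-readable units. It assumes the toolkit prints the standard NVMe SMART / Health Information log fields unmodified: temperatures in kelvins, data units in thousands of 512-byte blocks, and the temperature-time and busy-time counters in minutes. The helper names are my own, not part of any Samsung tool.

```python
# Units assumed from the NVMe SMART / Health Information log definition.
KELVIN_OFFSET = 273.15
BYTES_PER_DATA_UNIT = 512 * 1000  # one data unit = 1000 blocks of 512 bytes


def kelvin_to_celsius(raw):
    """Convert a raw temperature field (kelvins) to degrees Celsius."""
    return raw - KELVIN_OFFSET


def data_units_to_tb(raw):
    """Convert a Data Units Read/Written counter to decimal terabytes."""
    return raw * BYTES_PER_DATA_UNIT / 1e12


def minutes_to_hours(raw):
    """Convert a counter reported in minutes to hours."""
    return raw / 60


if __name__ == "__main__":
    # Values copied from the table above.
    print(f"Composite temperature: {kelvin_to_celsius(0x0142):.1f} C")
    print(f"Temperature sensor 2:  {kelvin_to_celsius(0x0149):.1f} C")
    print(f"Data read:             {data_units_to_tb(0x011BD521):.2f} TB")
    print(f"Data written:          {data_units_to_tb(0x010D94FB):.2f} TB")
    print(f"Power-on hours:        {0x1F93}")
    print(f"Power cycles:          {0x20}")
    print(f"Unsafe shutdowns:      {0x14}")
    print(f"Warning temp time:     {minutes_to_hours(0x00040880):.0f} h")
    print(f"Controller busy time:  {minutes_to_hours(0x9963):.0f} h")
```

Comparing the warning-temperature time against the power-on hours in the same units may be worth a look when reading the output.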